October 20, 2025 · 12 min read · By Prithivi Kai Balaji

PostgreSQL Bulk Insert Performance Analysis: A Comprehensive Comparison

PostgreSQL · Performance · Npgsql · .NET · Bulk Insert · Database Optimization




Introduction

When working with PostgreSQL and .NET applications, choosing the right data insertion strategy can significantly impact performance, especially when dealing with large datasets. This article presents a comprehensive performance analysis comparing different PostgreSQL insertion methods using Npgsql, including single inserts, batch inserts, bulk copy operations, and prepared statements.


Test Environment

  • Database: PostgreSQL
  • Client: .NET 8 with Npgsql driver
  • Test Data: 1,000,000 records with complex JSONB data
  • Record Structure: Each record includes name, email, age, salary, and a complex ProfileData object stored as JSONB


Sample Data Structure


CSHARP
public class TestRecord
{
    public string Name { get; set; }
    public string Email { get; set; }
    public int Age { get; set; }
    public decimal Salary { get; set; }
    public ProfileData ProfileData { get; set; }
}

public class ProfileData
{
    public string Department { get; set; }
    public string Position { get; set; }
    public List<string> Skills { get; set; }
    public Dictionary<string, object> Preferences { get; set; }
    public DateTime LastLogin { get; set; }
    public bool IsActive { get; set; }
}

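The insert statements later in the article imply a target table along these lines. The column names come from the queries themselves, but the identity primary key and the specific types (text, numeric(12,2)) are assumptions, so adjust them to your own schema.


CSHARP
// Creates the benchmark table; column names are taken from the INSERT statements below,
// while the identity column and exact types are assumptions for illustration.
await using var createTable = new NpgsqlCommand(@"
    CREATE TABLE IF NOT EXISTS performance_test (
        id            bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        name          text NOT NULL,
        email         text NOT NULL,
        age           integer NOT NULL,
        salary        numeric(12,2) NOT NULL,
        profile_data  jsonb NOT NULL
    )", connection);

await createTable.ExecuteNonQueryAsync();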

Performance Test Results

Summary Table

Method                 Records/sec   Total Time (1M records)   Performance Ratio
Single Inserts         3,786         264 seconds               1x (baseline)
Prepared Statements    4,986         200 seconds               1.3x
Batch Inserts          44,651        22 seconds                11.8x
🏆 Bulk Copy (COPY)    119,246       8 seconds                 31.5x


Detailed Analysis


1. Single Inserts

  • Performance: 3,786 records/sec
  • Total Time: 264 seconds (4.4 minutes)
  • Method: Individual INSERT statements with parameterized queries
  • Use Case: Small datasets, real-time applications
  • Pros: Simple, reliable, good for small volumes
  • Cons: Extremely slow for large datasets, high network overhead


CSHARP
foreach (var record in testData)
{
    await using var command = new NpgsqlCommand(@"
        INSERT INTO performance_test (name, email, age, salary, profile_data) 
        VALUES (@name, @email, @age, @salary, @profile_data::jsonb)", connection);
    
    command.Parameters.AddWithValue("@name", record.Name);
    command.Parameters.AddWithValue("@email", record.Email);
    command.Parameters.AddWithValue("@age", record.Age);
    command.Parameters.AddWithValue("@salary", record.Salary);
    command.Parameters.AddWithValue("@profile_data", JsonSerializer.Serialize(record.ProfileData));
    
    await command.ExecuteNonQueryAsync();
}

2. Prepared Statements

  • Performance: 4,986 records/sec
  • Total Time: 200 seconds (3.3 minutes)
  • Method: Pre-compiled SQL statements with parameter reuse
  • Use Case: Repeated similar operations
  • Pros: Slight performance improvement over single inserts
  • Cons: Still slow for bulk operations, limited benefit for one-time operations


CSHARP
await using var command = new NpgsqlCommand(@"
    INSERT INTO performance_test (name, email, age, salary, profile_data) 
    VALUES (@name, @email, @age, @salary, @profile_data::jsonb)", connection);

// Declare the parameters (with their types) before preparing, so they can be reused by name in the loop
command.Parameters.Add("@name", NpgsqlDbType.Text);
command.Parameters.Add("@email", NpgsqlDbType.Text);
command.Parameters.Add("@age", NpgsqlDbType.Integer);
command.Parameters.Add("@salary", NpgsqlDbType.Numeric);
command.Parameters.Add("@profile_data", NpgsqlDbType.Text);

await command.PrepareAsync();

foreach (var record in testData)
{
    command.Parameters["@name"].Value = record.Name;
    command.Parameters["@email"].Value = record.Email;
    command.Parameters["@age"].Value = record.Age;
    command.Parameters["@salary"].Value = record.Salary;
    command.Parameters["@profile_data"].Value = JsonSerializer.Serialize(record.ProfileData);
    
    await command.ExecuteNonQueryAsync();
}

3. Batch Inserts

  • Performance: 44,651 records/sec
  • Total Time: 22 seconds
  • Method: Multiple VALUES clauses in single INSERT statement
  • Use Case: Medium to large datasets, balanced approach
  • Pros: Significant performance improvement, good balance of speed and complexity
  • Cons: Memory usage increases with batch size; PostgreSQL's wire protocol caps a single statement at 65,535 bound parameters, so batch size × column count must stay below that limit


CSHARP
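// 1,000 rows x 5 columns = 5,000 parameters per statement, well under PostgreSQL's 65,535-parameter cap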
const int batchSize = 1000;
for (int i = 0; i < testData.Count; i += batchSize)
{
    var batch = testData.Skip(i).Take(batchSize).ToList();
    
    var valueSets = string.Join(",", batch.Select((record, index) => 
        $"(@name{index}, @email{index}, @age{index}, @salary{index}, @profile_data{index}::jsonb)"));
    
    await using var command = new NpgsqlCommand($@"
        INSERT INTO performance_test (name, email, age, salary, profile_data) 
        VALUES {valueSets}", connection);
    
    // Add parameters for each record in the batch
    for (int j = 0; j < batch.Count; j++)
    {
        var record = batch[j];
        command.Parameters.AddWithValue($"@name{j}", record.Name);
        command.Parameters.AddWithValue($"@email{j}", record.Email);
        command.Parameters.AddWithValue($"@age{j}", record.Age);
        command.Parameters.AddWithValue($"@salary{j}", record.Salary);
        command.Parameters.AddWithValue($"@profile_data{j}", JsonSerializer.Serialize(record.ProfileData));
    }
    
    await command.ExecuteNonQueryAsync();
}

4. Bulk Copy (COPY Command) - Winner! 🏆

  • Performance: 119,246 records/sec
  • Total Time: 8 seconds
  • Method: PostgreSQL's native COPY command with binary format
  • Use Case: Large datasets, data migration, ETL processes
  • Pros: Fastest method, minimal network overhead, optimized for bulk operations
  • Cons: More complex implementation, requires proper JSONB handling


CSHARP
await using var writer = await connection.BeginBinaryImportAsync(
    "COPY performance_test (name, email, age, salary, profile_data) FROM STDIN (FORMAT BINARY)");

foreach (var record in testData)
{
    await writer.StartRowAsync();
    await writer.WriteAsync(record.Name);
    await writer.WriteAsync(record.Email);
    await writer.WriteAsync(record.Age);
    await writer.WriteAsync(record.Salary);
    await writer.WriteAsync(JsonSerializer.Serialize(record.ProfileData), NpgsqlDbType.Jsonb);
}

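// CompleteAsync must be called before the importer is disposed; otherwise the COPY is rolled back.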
await writer.CompleteAsync();


Key Findings


1. Performance Hierarchy

The COPY command significantly outperforms all other methods:

  • 31.5x faster than single inserts
  • 24x faster than prepared statements
  • 2.7x faster than batch inserts

(All ratios are computed from the records/sec figures in the summary table.)


2. JSONB Handling Challenges

Working with JSONB data requires special consideration:

  • Serialization: Objects must be serialized to JSON strings (see the sketch after this list)
  • Type Casting: Use ::jsonb casting in SQL or NpgsqlDbType.Jsonb in binary operations
  • Binary Format: The COPY command with binary format provides the best performance for JSONB

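One detail the earlier snippets gloss over: all four methods should serialize ProfileData the same way so the stored JSONB is comparable across runs. Below is a minimal sketch of a shared serializer, assuming System.Text.Json and camelCase property names (both assumptions, not requirements of the benchmark).


CSHARP
using System.Text.Json;

public static class ProfileJson
{
    // One shared options instance so every insert path produces identical JSON
    // (camelCase naming is an assumption; use whatever convention your consumers expect).
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public static string Serialize(ProfileData profile) =>
        JsonSerializer.Serialize(profile, Options);
}


The text-based methods pass the resulting string as a parameter and cast it with ::jsonb in SQL; the binary COPY path writes the same string with NpgsqlDbType.Jsonb.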

3. Memory vs. Performance Trade-offs

  • Single Inserts: Low memory usage, poor performance
  • Batch Inserts: Moderate memory usage, good performance
  • COPY Command: Higher memory usage, excellent performance


Recommendations


When to Use Each Method

1. Single Inserts

  • Real-time applications with small data volumes
  • When simplicity is more important than performance
  • Interactive applications with user input


2. Prepared Statements

  • Repeated operations with similar structure
  • When you need a slight performance improvement over single inserts
  • Applications with predictable query patterns


3. Batch Inserts

  • Medium to large datasets (10K - 1M records)
  • When you need good performance without complexity
  • Applications requiring transaction control per batch


4. COPY Command

  • Large datasets (100K+ records)
  • Data migration and ETL processes
  • When maximum performance is required
  • Bulk data loading scenarios


Best Practices

  1. Choose the Right Method: Match the insertion method to your data volume and performance requirements
  2. Handle JSONB Properly: Always serialize complex objects and use appropriate type casting
  3. Batch Size Optimization: For bulk inserts, experiment with batch sizes (1000-5000 records typically work well)
  4. Connection Management: Use connection pooling for better performance
  5. Error Handling: Implement proper error handling and retry logic for production applications (a minimal pooling-and-retry sketch follows below)

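A minimal sketch tying together items 4 and 5 above: Npgsql pools connections by default, the pool sizes and connection string values here are illustrative placeholders, and the retry loop is deliberately naive rather than a production-grade policy.


CSHARP
using Npgsql;

// Pooling is enabled by default; the sizes and credentials below are placeholders.
var connectionString =
    "Host=localhost;Database=perf_test;Username=app;Password=changeme;" +
    "Minimum Pool Size=5;Maximum Pool Size=50";

await using var dataSource = NpgsqlDataSource.Create(connectionString);

const int maxAttempts = 3;
for (var attempt = 1; attempt <= maxAttempts; attempt++)
{
    try
    {
        await using var connection = await dataSource.OpenConnectionAsync();
        await using var transaction = await connection.BeginTransactionAsync();

        // ... run the batch insert or COPY import here ...

        await transaction.CommitAsync();
        break; // success
    }
    catch (NpgsqlException) when (attempt < maxAttempts)
    {
        // Naive linear backoff; production code would use a real retry policy (e.g. Polly).
        await Task.Delay(TimeSpan.FromSeconds(attempt));
    }
}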

Conclusion

The performance analysis clearly demonstrates that PostgreSQL's COPY command is the optimal choice for bulk data insertion, offering 31.5x better performance than single inserts. However, the choice of insertion method should be based on your specific requirements:


  • For maximum performance: Use COPY command
  • For a balanced approach: Use batch inserts
  • For simplicity: Use single inserts
  • For repeated operations: Use prepared statements


Understanding these performance characteristics and implementing the appropriate method for your use case can significantly improve your application's data insertion performance and overall user experience.


This analysis was conducted using .NET 8, Npgsql driver, and PostgreSQL. Results may vary based on hardware configuration, network conditions, and database settings.


Performance Results Summary

  • 31.5x - COPY vs Single Inserts
  • 119,246 records/sec - COPY throughput
  • 8 seconds - 1M records with COPY
  • 11.8x - Batch vs Single Inserts

🏆 Winner: PostgreSQL COPY command delivers exceptional performance for bulk data insertion, making it the optimal choice for large-scale data operations.

Performance Analysis Repository

Access the complete source code for all PostgreSQL insertion methods, performance benchmarks, test data generation, and detailed implementation examples used in this analysis.


Key Takeaways

  • COPY command is 31.5x faster than single inserts for bulk operations
  • Batch inserts offer the best balance of performance and simplicity
  • JSONB handling requires proper serialization and type casting
  • Choose the method based on your data volume and performance requirements