Using JDBC and Postgres Copy API to bulk load data into the Postgres database
Introduction
- COPY command copies data between file and table, while COPY TO copies data from table to file, and COPY FROM does the opposite.
- Copy command is very efficient when we want to insert bulk data into the table.
- We have covered bulk insert data in previous article, in this article we cover bulk insert using the COPY command.
Table
- We have a book_reviews table that records the review from different users for different books.
postgres=# \d book_reviews
Table "public.book_reviews"
Column | Type | Collation | Nullable | Default
-------------+---------+-----------+----------+--------------------------------------------------
reviews_id | integer | | not null | nextval('book_reviews_reviews_id_seq'::regclass)
user_id | text | | |
isbn | text | | |
book_rating | text | | |
Indexes:
"book_reviews_pkey" PRIMARY KEY, btree (reviews_id)
- Total records in book_reviews table.
postgres=# select count(*) from book_reviews;
count
---------
2469085
(1 row)
Business Logic
Building Connection Object
- At first, we will initialize the connection object.
private static Connection getConnection() throws SQLException {
Connection connection = DriverManager.getConnection("jdbc:postgresql://localhost/postgres",
USERNAME, PASSWORD);
if(connection != null){
System.out.println("Connected to database");
}
connection.setAutoCommit(false);
return connection;
}
Copy Logic
- Postgres driver provides Copy API that we can use to execute copy command that takes read input from STDIN.
- We can copy values either from a file or from in-memory objects/entities.
From Input File
- First, we need to get the CopyManager instance, which will provide copyIn method. The CopyIn method takes copy command as the first parameter and InputStream as the second parameter.
- We can open InputStream from the input file.
private static void copy() throws SQLException, IOException {
Connection connection = getConnection();
CopyManager copyManager = new CopyManager((BaseConnection) connection);
// file input stream
InputStream inputStream = Files.newInputStream(Path.of("/Users/surajmishra/Desktop/Book-Ratings.csv"));
long l = copyManager
.copyIn("COPY book_reviews(user_id, isbn, book_rating) FROM STDIN delimiter ';' csv header encoding 'UTF-8'",
inputStream);
connection.commit();
connection.close();
}
- We ingested almost 3M records.
postgres=# select count(*) from book_reviews;
count
---------
2962898
(1 row)
From In Memory Object
- We can also build a book_review object
private static BookReview getBookReview() {
BookReview bookReview = new BookReview();
String seed = UUID.randomUUID().toString();
bookReview.setIsbn("isbn"+seed);
bookReview.setBookRating("2.3");
bookReview.setUserId(seed);
return bookReview;
}
- And convert to csv record since we are reading input as CSV in COPY FROM command.
private static String toCsv(BookReview a) {
return a.getUserId() + ", " +
a.getIsbn() + ", " +
a.getBookRating();
}
- Let’s create few book_review objects and convert them to CSV.
- Now we need to convert the String of csvRows to InputStream since copyIn method takes InputStream as parameter.
- We can convert String to bytes and construct ByteArrayInputStream which extends InputStream.
public static void copy2() throws SQLException, IOException {
List<BookReview> bookReviews = IntStream.rangeClosed(1, 10)
.mapToObj(i -> getBookReview())
.toList();
String csvRows = bookReviews.stream()
.map(LoadDataWithCopy::toCsv)
.collect(Collectors.joining("\n"));
InputStream is = new ByteArrayInputStream(csvRows.getBytes());
Connection connection = getConnection();
CopyManager copyManager = new CopyManager((BaseConnection) connection);
long l = copyManager.copyIn("COPY book_reviews(user_id, isbn, book_rating) FROM STDIN delimiter ',' csv encoding 'UTF-8'",
is);
}
Conclusion
- Copy Command is very handy when we need to insert/load data in bulk.
- Postgres provides Copy API that helps to execute copy commands.
- COPY is usually faster than multi values insert command.
Before You Leave
- If you like this content consider becoming medium member to support.
- Upgrade your Java skills with Grokking the Java Interview
- If you want to upskill your Java skills, you should definitely check out
[NEW] Master Spring Boot 3 & Spring Framework 6 with Java
[ 38 hrs content, 4.7/5 stars, 6+ students already enrolled]