How to Split a Large Excel File into Smaller Files

Problem description

There’s an orders file orders.xlsx that has over a million rows. Below is part of the file:

Task: split the file into smaller files according to two methods:

1. Split by category, such as storing orders data of each state as a file.

2. Split by row, such as storing every ten thousand rows as a file.

Implementation directions:

1. Start esProc

Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time.

2. Write a script:

(1) Split by category

It’s more convenient to view the code in this way:

A1 Define the directory that will store the future files.

A2 Use @c option to open orders.xlsx with cursor since it’s probably the file cannot fit into the memory. @t option enables importing the first row as column headers.

A3 Retrieve rows from A2’s cursor, with 50000 rows at a time.

B3 Group the current batch of rows by state.

B4 Handle each group (rows of state) circularly.

C4 Define files for storing the groups and name them after states.

C5D5 If a namesake file already exists, use @a option to append the current batch of rows to it.

C6D6 If there isn’t a namesake file, use @t option to write the current batch of rows to it, with the first row being the column headers.

(2)Split by row

esProc script:

A1 Define the directory that will store the future files.

A2 Use @c option to open orders.xlsx with cursor since it’s probably the file cannot fit into the memory. @t option enables importing the first row as column headers.

A3 Retrieve rows from A2’s cursor, with 100000 rows at a time.

B3 Store the current batch of rows into a new file. #A3 represents the loop number in A3.

3. Press F9 to run the program. Then you can check the expected smaller files in E:/orders directory.

【 Reference 】 split.zip

--

--

--

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Working with constraints in IOS Animations…

what I have been waiting for has come !

Introduction to Apache Spark

Continuous Integration with Apigee and Github Actions

How to Handle Huge XLS Files

Read a Barcode

The Weekly Squeak — Software error tracking with Brian Rue of Rollbar

Maxresdefault

SDK vs API — Why do those two tools usually go hand in hand

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
chuxin huo

chuxin huo

hello:)

More from Medium

MENDIX Adding Validation Rules efficiently using rules

Home Loan Status Prediction Using Logistic Regression

Credit Card Application Risk Scoring and Approvement Prediction

Inverse polynomial linear units “IplU” Activation Function