File size cap when saving to csv?

BuJay

New Member
Joined
Jun 24, 2020
Messages
7
Office Version
365, 2019, 2016, 2013
Platform
Windows
I am seeing really, really strange behavior right now. Would appreciate any and all thoughts.

I created an excel file with 2500 columns and 100,000 rows and saved it as a csv to practice loading large csv files into Python using Pandas. I was able to load the 100,000 row file with pandas successfully.

Then, I simply copied the 100,000 rows (excluding headers) and pasted in rows 100,001 through 200,000 to create a 200,000 row csv file. I was able to load the 200,000 row file with pandas successfully.

Then, I simply added another 100,000 rows (excluding headers) and pasted in rows 200,001 through 300,000 to create a 300,000 row csv file. I was able to load the 300,000 row file with pandas successfully.

Here is where it gets strange. The size of the 300,000 row csv is 4.19 GB.

When I open that file, add another 100,000 rows, and save it as a 400,000 row csv file, the file size remains 4.19 GB. Something seems to be corrupting the csv file, as its structure appears to change and I cannot load it successfully.

I am deducing that something is corrupting it during the save process. Any thoughts?

As an aside, I know there isn't any real reason to use this process for large files - I get that. I am still curious as to what is going on. Also, it is not a python nor pandas issue.

Thanks
 


BuJay

Any help or thoughts?
 

JonXL

Active Member
Joined
Feb 5, 2018
Messages
354
Office Version
365, 2016
Platform
Windows
No advice, just a question: When you open the 'corrupt' CSV in a text editor, what exactly is wrong with its structure, etc.?
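A file of this size will not open in most text editors, so one way to answer the question is to peek at it from Python instead. Below is a minimal sketch, assuming a hypothetical helper name `peek` and UTF-8 text; it reports the size in bytes and returns the first few lines plus the last few lines that follow them, without loading the whole file into memory.

```python
import os
from collections import deque
from itertools import islice


def peek(path, n=3, encoding="utf-8"):
    """Return (size_in_bytes, first n lines, last n lines after those)."""
    size = os.path.getsize(path)
    # newline="" keeps the original line endings so stray \r or \n stay visible
    with open(path, "r", encoding=encoding, errors="replace", newline="") as f:
        head = list(islice(f, n))
        # a deque with maxlen keeps only the most recent n lines while streaming
        tail = list(deque(f, maxlen=n))
    return size, head, tail
```

Calling `peek("big.csv")` (the file name is a placeholder) on both the 300,000 row and the 400,000 row file and comparing the sizes and the final lines should show whether the larger file simply stops early or ends mid-row.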
 

BuJay

It's strange. I can use the "error_bad_lines" option, but it flags basically every row after the initial error row as an error. The whole point is that I can't really open the .csv files in Excel to look at the rows, and when I do, everything looks fine.
 

jay_py

New Member
Joined
Mar 29, 2020
Messages
7
Office Version
2016, 2013
Platform
Windows
Since you are only reading the csv with Python, Python is not the cause of the problem.

A .csv file is basically similar to a .txt file, which means you can open a .csv with any text editor. Try opening your large file with a text editor and look through the columns and rows. Is there anything extra that shouldn't be there? I suspect you may have copied and pasted something extra...
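Scrolling through a multi-gigabyte file by hand is impractical, so the same check can be scripted. This is a rough sketch using only Python's standard `csv` module; the function name `find_ragged_rows` is hypothetical, and it assumes a comma-delimited file whose first row is the header. It streams the file and reports the rows whose field count differs from the header's.

```python
import csv


def find_ragged_rows(path, limit=10, encoding="utf-8"):
    """Return (total_rows, [(line_number, field_count), ...]) for mismatched rows."""
    bad = []
    with open(path, "r", encoding=encoding, errors="replace", newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return 0, bad  # empty file
        expected = len(header)
        total = 1  # the header counts as the first row
        for row in reader:
            total += 1
            # record only the first `limit` mismatches to keep the report short
            if len(row) != expected and len(bad) < limit:
                # line_num is the number of physical lines read so far
                bad.append((reader.line_num, len(row)))
    return total, bad
```

If the first mismatched line is near where the file appears to stop growing, that would point at the save step rather than at anything pasted in by mistake.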
 
