Convert xlsx to csv in Linux with command line
I'm looking for a way to convert xlsx files to csv files on Linux.
I do not want to use PHP/Perl or anything like that because I need to process several million lines, so it has to be fast. I found a program in the Ubuntu repositories called xls2csv, but it only converts xls (Office 2003) files, which is what I'm currently using. I need support for the newer Excel format.
Any ideas?
The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:
$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv
$ cat newfile.csv
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line
To install on Ubuntu:
apt-get install gnumeric
To install on Mac:
brew install gnumeric
You can do this with LibreOffice:
libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"
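For reasons that are not clear to me, you might need to run this with sudo. You can let LibreOffice run under sudo without a password prompt by adding a line like this to your sudoers file (edit it with visudo; sudoers needs the full path to the command, and /usr/bin/libreoffice is only the usual location, so check yours with command -v libreoffice):
users ALL=(ALL) NOPASSWD: /usr/bin/libreoffice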
If you already have a desktop environment then I'm sure Gnumeric / LibreOffice would work well, but on a headless server (such as one on Amazon Web Services) they pull in dozens of dependencies that you also need to install.
I found this Python alternative:
https://github.com/dilshod/xlsx2csv
$ pip install xlsx2csv
$ xlsx2csv file.xlsx > newfile.csv
Took 2 seconds to install and works like a charm.
If you have multiple sheets you can export all at once, or one at a time:
$ xlsx2csv file.xlsx --all > all.csv
$ xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
$ xlsx2csv file.xlsx -s 1 > sheet1.csv
The project's author also links to several alternatives built in Bash, Python, Ruby, and Java.
answered Feb 14 '14 at 18:34
andrewtweber
In bash, I used this libreoffice command to convert all my xlsx files in the current directory:
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i" ; done
Close all open LibreOffice instances before running it, or it will fail silently.
Quoting "$i" takes care of spaces in the filenames.
I tried again some years later and it didn't work. This thread gives some tips, but the quickest solution was to run it as root (or with sudo libreoffice). Not elegant, but quick.
On Windows, use the scalc.exe command instead.
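The loop above can be wrapped into a small function that writes the CSV files into a separate directory and warns when a LibreOffice instance seems to be open. This is only a sketch: convert_dir and SOFFICE_BIN are names made up for this example (they are not part of LibreOffice), and the check assumes the running process is called soffice.bin, which is typical on Linux but may differ on your system.

```shell
#!/usr/bin/env bash
# Sketch: batch-convert every .xlsx in a directory to CSV with LibreOffice.
# SOFFICE_BIN is a hypothetical override hook so another binary can be used.
SOFFICE_BIN="${SOFFICE_BIN:-libreoffice}"

convert_dir() {
    local src="${1:-.}" out="${2:-csv_out}" f ok=0 failed=0
    mkdir -p "$out" || return 1

    # Assumption: a running instance shows up as "soffice.bin".
    if pgrep -x soffice.bin >/dev/null 2>&1; then
        echo "warning: LibreOffice seems to be running; conversion may fail silently" >&2
    fi

    for f in "$src"/*.xlsx; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if "$SOFFICE_BIN" --headless --convert-to csv --outdir "$out" "$f"; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
    done

    echo "converted: $ok, failed: $failed" >&2
    [ "$failed" -eq 0 ]                  # non-zero status if anything failed
}
```

Call it as convert_dir ./spreadsheets ./csv. Both arguments are optional and default to the current directory and csv_out.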
answered Nov 6 '14 at 9:10
Holger Brandl
Another option would be to use R via a small bash wrapper for convenience:
xlsx2txt(){
    # Print the first sheet of the workbook given as $1 as tab-separated text
    echo '
    library(xlsx)
    write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=FALSE, row.names=FALSE, col.names=TRUE, sep="\t")
    ' | Rscript --vanilla - "$1" 2>/dev/null
}
xlsx2txt file.xlsx > file.txt
answered Sep 2 '14 at 15:03
Holger Brandl
If the .xlsx file has several sheets, the -s flag of xlsx2csv selects the one you want. For example:
xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv
second_sheet.csv will then contain the data from the 2nd sheet of my_file.xlsx.
Using the Gnumeric spreadsheet application, which comes with a command-line utility called ssconvert, is indeed super simple:
find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;
and you're done!
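If you would rather name each output file explicitly and skip workbooks that were already converted, the same idea can be sketched as a function. convert_tree and SSCONVERT_BIN are hypothetical names used only for this example, and the loop assumes that no filename contains a newline.

```shell
#!/usr/bin/env bash
# Sketch: walk a directory tree and write foo.csv next to every foo.xlsx.
# SSCONVERT_BIN is a hypothetical override hook for the converter binary.
SSCONVERT_BIN="${SSCONVERT_BIN:-ssconvert}"

convert_tree() {
    local root="${1:-.}" src dst
    # One path per line; handles spaces, but not newlines, in filenames.
    find "$root" -type f -name '*.xlsx' |
    while IFS= read -r src; do
        dst="${src%.xlsx}.csv"           # swap the extension
        # Skip workbooks whose CSV is already newer than the source.
        if [ -e "$dst" ] && [ "$dst" -nt "$src" ]; then
            continue
        fi
        # </dev/null keeps the converter from consuming the list of paths.
        "$SSCONVERT_BIN" "$src" "$dst" </dev/null || echo "failed: $src" >&2
    done
}
```

Run it as convert_tree /path/to/spreadsheets. Running it a second time only reconverts workbooks that changed after their CSV was written.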
If you are OK with running a Java command-line tool, you can do it with Apache POI HSSF's Excel Extractor (HSSF covers the older .xls format; POI's XSSF component is the counterpart for .xlsx). It has a main method that is described as a command-line extractor. This one seems to just dump everything out. They also point to this example that converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.
Another option that might fly, but will require some work on the other end, is to have your Excel files delivered to you as Excel XML Data or XML Spreadsheet, or whatever MS calls that format these days. That opens up a whole new world of opportunities for you to slice and dice the data the way you want.
answered May 11 '12 at 19:42
Pavel Veller
As others said, LibreOffice can convert xls files to csv. The problem for me was the sheet selection.
This LibreOffice Python script does a fine job of converting a single sheet to CSV.
Usage is:
./libreconverter.py File.xls:"Sheet Name" output.csv
The only downside (on my end) is that --headless doesn't seem to work: a LibreOffice window shows up for a second and then quits.
That's OK with me; it's the only tool that does the job quickly.
answered Dec 16 '16 at 10:22
Benoit Duffez
Source: Stack Overflow