5 Methods to Convert xlsx Format Files to CSV on Linux CLI

XLSX is the file extension of the Office Open XML spreadsheet format used by Microsoft Excel. Converting an Excel sheet to a comma-separated values (CSV) file is easy from the command line. A typical case is when you receive an XLSX file and need to load its data into a database after reformatting it. Several command-line tools can do this conversion.

1) Gnumeric spreadsheet program

Gnumeric is a spreadsheet program for Unix and Unix-like operating systems, distributed under the GNU General Public License. Like any spreadsheet, it saves your work to files that you can reopen in a later session. It can import and export spreadsheet data in multiple formats, including CSV, Microsoft Excel, HTML, OpenDocument, Quattro Pro, and LaTeX.

Gnumeric is not available in the default CentOS 7 repositories, so you must first install the lux-release package, which enables the lux repository. First, download it:

# wget http://repo.iotti.biz/CentOS/7/noarch/lux-release-7-1.noarch.rpm
--2017-10-13 23:32:19-- http://repo.iotti.biz/CentOS/7/noarch/lux-release-7-1.noarch.rpm
Resolving repo.iotti.biz (repo.iotti.biz)...
Connecting to repo.iotti.biz (repo.iotti.biz)||:80... connected.

Now you can install the lux release

# rpm -Uvh lux-release-7-1.noarch.rpm 
warning: lux-release-7-1.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 53e4e7a9: NOKEY
Preparing... ################################# [100%]
Updating / installing...
 1:lux-release-7-1 ################################# [100%]

With lux-release installed, we can now install the gnumeric package:

# yum install gnumeric
Loaded plugins: fastestmirror, langpacks
lux | 2.9 kB 00:00:00 
lux/7/primary_db | 1.0 MB 00:00:05 
Loading mirror speeds from cached hostfile
 * base: ftp.hosteurope.de
 * epel: mirror.liquidtelecom.com
 * extras: ftp.hosteurope.de
 * updates: ftp.hosteurope.de
Resolving Dependencies
--> Running transaction check
---> Package gnumeric.x86_64 1:1.10.10-2.el7.lux.1 will be installed

Now you can use the ssconvert command, which ships with Gnumeric, to convert the file:

# ssconvert book.xlsx file.csv
Using exporter Gnumeric_stf:stf_csv

You can now view the resulting file:

# cat file.csv 
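
If the workbook holds several sheets, exporting each one to its own file can be handy. The following is only a minimal sketch. It assumes that your ssconvert build supports the -S (--export-file-per-sheet) option, which substitutes the sheet number for %n in the output name. The function names sheet_template and convert_sheets are made up for this example.

```shell
# Sketch: export each sheet of a workbook to its own CSV file with ssconvert.
# Assumption: this ssconvert supports -S (--export-file-per-sheet).

# Build an output template such as "out/book-%n.csv" from "path/book.xlsx".
# ssconvert is expected to replace %n with the sheet number.
sheet_template() (
    workbook=$1
    outdir=$2
    base=$(basename "$workbook")
    printf '%s/%s-%%n.csv\n' "$outdir" "${base%.*}"
)

# Convert every sheet of the given workbook into the output directory.
convert_sheets() (
    workbook=$1
    outdir=${2:-.}
    command -v ssconvert >/dev/null 2>&1 || { echo "ssconvert not found" >&2; exit 1; }
    mkdir -p "$outdir" || exit 1
    ssconvert -S "$workbook" "$(sheet_template "$workbook" "$outdir")"
)
```

For instance, `convert_sheets book.xlsx out` would be expected to leave one csv file per sheet under out/.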

2) xlsx2csv converter

xlsx2csv is a Python application that can convert a batch of XLSX files to CSV format. You can specify exactly which sheets to convert: if the workbook has multiple sheets, xlsx2csv can export all of them at once or one at a time.

To install it, you need to have python already installed. Then, you can proceed as below:

# easy_install xlsx2csv
Searching for xlsx2csv
Reading https://pypi.python.org/simple/xlsx2csv/
Best match: xlsx2csv 0.7.3
Downloading https://pypi.python.org/packages/4c/56/4c7f595525839710ab563c8e5a48226021111c1324b1460e603256f7665c/xlsx2csv-0.7.3.tar.gz#md5=b9cffbbe815259987237135f99658c63
Processing xlsx2csv-0.7.3.tar.gz

Now you can convert your xlsx file:

# xlsx2csv book.xlsx > convert.csv

You can check the content of the file

# cat convert.csv 

By default, the xlsx2csv command converts only the first sheet, even if your file contains multiple sheets. Fortunately, it can convert all the sheets or only the one you choose. Some useful parameters:

  • -a, --all to export all sheets
  • -d DELIMITER for columns delimiter in csv
  • -p SHEETDELIMITER for sheet delimiter used to separate sheets, pass '' if you do not need delimiter, or 'x07' or '\f' for form feed (default: '--------')
  • -s SHEETID for the sheet number to convert

For example, if you want to convert only a specific sheet

# xlsx2csv class.xlsx -s 2 > sheet2.csv

You can check

# cat sheet2.csv 

If you want to convert all the sheets, you can do as below:

# xlsx2csv class.xlsx --all > allsheet.csv

You can check the content as below

# cat allsheet.csv 
-------- 1 - Sheet1
-------- 2 - Sheet2
-------- 3 - Sheet3

You can see that the default sheet delimiter, followed by the sheet number and name, marks where each sheet begins.
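
Because every sheet starts with such a header line, the combined output can be split back into one file per sheet. The following is a minimal sketch under that assumption (default delimiter, header lines of the form "-------- number - name"). The function name split_sheets is hypothetical and is not part of xlsx2csv.

```shell
# Sketch: split "xlsx2csv --all" output (read from stdin) into one CSV per sheet.
# Assumption: the default sheet delimiter is in use, so each sheet starts with
# a header line like "-------- 2 - Sheet2".
split_sheets() (
    outdir=${1:-.}
    mkdir -p "$outdir" || exit 1
    awk -v dir="$outdir" '
        /^-------- [0-9]+ - / {
            if (out != "") close(out)
            name = $0
            sub(/^-------- [0-9]+ - /, "", name)   # keep only the sheet name
            gsub(/[^A-Za-z0-9._-]/, "_", name)     # make it safe as a file name
            out = dir "/" name ".csv"
            printf "" > out                        # create the file even if the sheet is empty
            next
        }
        out != "" { print > out }                  # data lines go to the current sheet file
    '
)
```

A possible use is `xlsx2csv class.xlsx --all | split_sheets sheets`, which would write sheets/Sheet1.csv, sheets/Sheet2.csv and so on. A data row that happens to look like a header line would be mistaken for a new sheet, so treat this as a starting point only.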

3) csvkit tool

csvkit is a Python library and set of command-line tools for working with CSV files. It is a nice tool to manipulate, organize and analyze data in the csv format, and it is light and fast. Its in2csv command converts a variety of common file formats, including xls, xlsx and fixed-width, into CSV. Install it with pip:

# pip install csvkit
Collecting csvkit
 Using cached csvkit-1.0.2.tar.gz
Collecting agate>=1.6.0 (from csvkit)

Now you can convert as below:

# in2csv Classeur2.xlsx > book3.csv
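
By default in2csv reads a single sheet of the workbook. The following is a minimal sketch for exporting chosen sheets one by one. It assumes that your csvkit version provides the --sheet option, which takes a sheet name. The function name export_named_sheets is hypothetical, and the sheet names are whatever your own workbook contains.

```shell
# Sketch: convert selected sheets of a workbook with in2csv, one CSV per sheet.
# Assumption: in2csv accepts --sheet NAME to pick the sheet to read.
export_named_sheets() (
    workbook=$1
    shift
    for sheet in "$@"; do
        # Output name: workbook name without extension, plus the sheet name.
        in2csv --sheet "$sheet" "$workbook" > "${workbook%.*}-$sheet.csv" || exit 1
    done
)
```

For example, `export_named_sheets Classeur2.xlsx Sheet1 Sheet2` would be expected to produce Classeur2-Sheet1.csv and Classeur2-Sheet2.csv next to the workbook.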

4) unoconv

unoconv relies on an OpenOffice or LibreOffice installation to perform format conversions on the command line. Depending on your distribution, it may be installed together with the office suite or as a separate package. You can display its help:

# unoconv --help
usage: unoconv [options] file [file2 ..]

Convert from and to any format supported by LibreOffice

unoconv options:
  -c, --connection=string  use a custom connection string
  -d, --doctype=type       specify document type
                             (document, graphics, presentation, spreadsheet)
  -e, --export=name=value  set export filter options
                             eg. -e PageRange=1-2
  -f, --format=format      specify the output format
  -i, --import=string      set import filter option string
                             eg. -i utf8
  -l, --listener           start a permanent listener to use by unoconv clients
  -n, --no-launch          fail if no listener is found (default: launch one)
  -o, --output=name        output basename, filename or directory
      --pipe=name          alternative method of connection using a pipe
  -p, --port=port          specify the port (default: 2002)
                             to be used by client or listener
      --password=string    provide a password to decrypt the document
  -s, --server=server      specify the server address (default:
                             to be used by client or listener
      --show               list the available output formats
      --stdout             write output to stdout
  -t, --template=file      import the styles from template (.ott)
  -T, --timeout=secs       timeout after secs if connection to listener fails
  -v, --verbose            be more and more verbose (-vvv for debugging)

The command can convert between various file formats. By default, it converts to pdf, so you must state the format you want. To convert to csv with unoconv, you need two main parameters:

  • -f to indicate the format of the output file
  • -o to indicate the name and path of the converted file

# unoconv -f csv -o class2.csv Classeur2.xlsx

You can check the content

# cat class2.csv 

Note that the second row of our original xlsx file is empty, which is why only a comma appears on the second line of the csv file.
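
To convert a whole directory, the same -f and -o options can be wrapped in a loop. This is only a sketch. The function name convert_dir_unoconv and the directory names are placeholders chosen for the example.

```shell
# Sketch: convert every .xlsx file in a directory to csv with unoconv.
convert_dir_unoconv() (
    srcdir=${1:-.}
    outdir=${2:-$srcdir}
    mkdir -p "$outdir" || exit 1
    for f in "$srcdir"/*.xlsx; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=$(basename "$f" .xlsx)
        unoconv -f csv -o "$outdir/$name.csv" "$f" || exit 1
    done
)
```

For example, `convert_dir_unoconv . csv` would convert each xlsx file of the current directory into the csv/ subdirectory.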

5) Libreoffice headless

LibreOffice accepts various parameters that influence its behavior when it is started from the command line. Its headless mode lets you run LibreOffice without any graphical interface component, which makes it possible to convert files between formats, including xlsx to csv. Give the target format (csv) with the --convert-to parameter, followed by the file to convert, as below:

# libreoffice --headless --convert-to csv book.xlsx --outdir conv/
convert /home/admin/Desktop/book.xlsx -> /home/admin/Desktop/conv/book.csv using filter : Text - txt - csv (StarCalc)

Now you can check the file

# cat conv/book.csv 

You can also convert several xlsx files at once, as below:

# libreoffice --headless --convert-to csv --outdir conv/ *.xlsx
convert /home/admin/Desktop/book.xlsx -> /home/admin/Desktop/conv//book.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/Classeur1.xlsx -> /home/admin/Desktop/conv//Classeur1.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/Classeur2.xlsx -> /home/admin/Desktop/conv//Classeur2.csv using filter : Text - txt - csv (StarCalc)
convert /home/admin/Desktop/class.xlsx -> /home/admin/Desktop/conv//class.csv using filter : Text - txt - csv (StarCalc)

You can list the converted files as below:

[root@centos7-srv Desktop]# ls conv
book.csv class.csv Classeur1.csv Classeur2.csv

You can check the content of one file

# cat conv/Classeur2.csv 
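
The --convert-to value can also name the export filter and pass it options, which helps when you need a different separator or encoding. The following is a minimal sketch, assuming the usual token order of the LibreOffice CSV filter options (field separator, text delimiter, character set, first line), where 44 is a comma, 59 a semicolon, 34 a double quote and 76 UTF-8. The function names csv_filter and convert_with_options are hypothetical.

```shell
# Sketch: build a --convert-to argument that selects the CSV filter with options.
# Assumed token order: field separator, text delimiter (ASCII codes),
# character set (76 = UTF-8), number of the first line to export.
csv_filter() (
    sep=${1:-44}        # 44 = comma, 59 = semicolon
    quote=${2:-34}      # 34 = double quote
    charset=${3:-76}    # 76 = UTF-8
    printf 'csv:Text - txt - csv (StarCalc):%s,%s,%s,1\n' "$sep" "$quote" "$charset"
)

# Convert one workbook into outdir, using the given field separator code.
convert_with_options() (
    workbook=$1
    outdir=${2:-.}
    sep=${3:-44}
    libreoffice --headless --convert-to "$(csv_filter "$sep")" --outdir "$outdir" "$workbook"
)
```

For example, `convert_with_options book.xlsx conv 59` would ask for a semicolon-separated, UTF-8 encoded conv/book.csv.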

We have seen several tools available on Linux for converting xlsx files to csv on the command line. If you need odt or pdf output instead, unoconv and LibreOffice headless can do that as well. The Miller tool, which converts between formats and more, is also worth trying.

Comments

  1. Hi, I have a requirement to merge multiple .csv files into Excel.
    I am on a Linux system. Can you please let me know what the approaches are and what software I need on Linux,
    so that I can request it from the IT team?

  2. These are nice, but I am curious whether one of these can get the color of each row and add a column with the color during the conversion. Any idea if and how to do this?

  3. Thanks Alain for this precious page.
    You eventually put in the garbage bin my many years in the study of R import !!!
    Just a remark, which may be useful to others: the unoconv method seems to be the only one that respects the local date format.
    Consider the date such as the first day of september, year 2019, as shown in my italian mint box
    It appears as above in the xlsx file - read through libreoffice - and turned out to be the very same on the converted file with unoconv.

    I tried both in2csv and xls2csv, but the date was transformed into
    in both cases.

    Thanks again

  4. Great post thanks!
    Managed to convert xlsx to csv with libreoffice in Cygwin 64 bit on Windows 10.
    In this version the executable name is `soffice`, and `-outdir` is deprecated in favor of `--outdir`.
    Added `.../Program Files/LibreOffice/program` to user path.
    Conversion command is like this:
    `soffice --headless --convert-to csv myfile.xlsx --outdir csv/`
    Before that I tried with xlsx2csv. It installed in Cygwin (with a warning) and silently failed to convert.

  5. Batch convert implies headless so can drop that param.

    Also for macOS users, soffice can be found within the package, i.e.


    as in

    /Applications/LibreOffice.app/Contents/MacOS/soffice --convert-to csv *.xls

