(PHP 5 >= 5.3.0)
str_getcsv — Parse a CSV string into an array
array str_getcsv ( string $input [, string $delimiter = ',' [, string $enclosure = '"' [, string $escape = '\\' ]]] )
input
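The string to parse.
delimiter
Sets the field delimiter (one character only).
enclosure
Sets the field enclosure character (one character only).
escape
Sets the escape character (one character only). Defaults to a backslash (\).
Returns an indexed array containing the fields read.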
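A minimal usage sketch, with made-up sample strings: each call parses a single CSV record. All four arguments are passed explicitly here so the behaviour does not depend on defaults.

```php
<?php
// A quoted field keeps the delimiter it contains.
$fields = str_getcsv('a,"b,c",d', ',', '"', '\\');
print_r($fields);   // three fields: a / b,c / d

// Inside an enclosed field, a doubled enclosure character is read as one literal quote.
$quoted = str_getcsv('"aaa","b""bb","ccc"', ',', '"', '\\');
print_r($quoted);   // three fields: aaa / b"bb / ccc
```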
normadize -a- gmail -d- com (2013-03-13 23:19:04)
As some other users here noted, str_getcsv() cannot be used if you want to comply with either the RFC or with most spreadsheet tools like Excel or Google Docs.
These tools do not escape commas or new lines, but instead place double-quotes (") around the field. If there are any double-quotes in the field, these are escaped with another double-quote (" becomes ""). All this may look odd, but it is what the RFC and most tools do ...
For instance, take a Google Docs spreadsheet whose field values contain new lines and commas, export it as .csv (File > Download as > .csv) and see how the .csv content looks, then try to parse it using str_getcsv() ... it will fail spectacularly regardless of the arguments you pass to it.
Here is a function that can handle everything correctly, and more:
- doesn't use any for or while loops,
- it allows for any separator (any string of any length),
- option to skip empty lines,
- option to trim fields,
- can handle UTF8 data too (although .csv files are likely non-unicode).
Here is the more human readable version of the function:
<?php
// returns a two-dimensional array of rows and fields
function parse_csv ($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
$enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string);
$enc = preg_replace_callback(
'/"(.*?)"/s',
function ($field) {
return urlencode(utf8_encode($field[1]));
},
$enc
);
$lines = preg_split($skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s', $enc);
return array_map(
function ($line) use ($delimiter, $trim_fields) {
$fields = $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line);
return array_map(
function ($field) {
return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
},
$fields
);
},
$lines
);
}
?>
Since this is not using any loops, you can actually write it as a one-line statement (one-liner).
Here's the function using just one line of code for the function body, formatted nicely though:
<?php
// returns the same two-dimensional array as above, but with a one-liner code
function parse_csv ($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
return array_map(
function ($line) use ($delimiter, $trim_fields) {
return array_map(
function ($field) {
return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
},
$trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line)
);
},
preg_split(
$skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s',
preg_replace_callback(
'/"(.*?)"/s',
function ($field) {
return urlencode(utf8_encode($field[1]));
},
$enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string)
)
)
);
}
?>
Replace !!Q!! with another placeholder if you wish.
Have fun.
V.Krishn (2013-03-06 01:31:00)
Note: unlike str_getcsv() (PHP 5.3), this function trims all values.
<?php
/**
* @link https://github.com/insteps/phputils (for updated code)
* Parse a CSV string into an array for php 4+.
* @param string $input String
* @param string $delimiter String
* @param string $enclosure String
* @return array
*/
function str_getcsv4($input, $delimiter = ',', $enclosure = '"') {
if( ! preg_match("/[$enclosure]/", $input) ) {
return (array)preg_replace(array("/^\\s*/", "/\\s*$/"), '', explode($delimiter, $input));
}
$token = "##"; $token2 = "::";
//alternate tokens "\034\034", "\035\035", "%%";
$t1 = preg_replace(array("/\\\[$enclosure]/", "/$enclosure{2}/",
"/[$enclosure]\\s*[$delimiter]\\s*[$enclosure]\\s*/", "/\\s*[$enclosure]\\s*/"),
array($token2, $token2, $token, $token), trim(trim(trim($input), $enclosure)));
$a = explode($token, $t1);
foreach($a as $k=>$v) {
if ( preg_match("/^{$delimiter}/", $v) || preg_match("/{$delimiter}$/", $v) ) {
$a[$k] = trim($v, $delimiter); $a[$k] = preg_replace("/$delimiter/", "$token", $a[$k]); }
}
$a = explode($token, implode($token, $a));
return (array)preg_replace(array("/^\\s/", "/\\s$/", "/$token2/"), array('', '', $enclosure), $a);
}
if ( ! function_exists('str_getcsv')) {
function str_getcsv($input, $delimiter = ',', $enclosure = '"') {
return str_getcsv4($input, $delimiter, $enclosure);
}
}
?>
khelibert at gmail dot com (2012-09-04 16:42:46)
I've written this to handle:
- fields with or without enclosure;
- escape and enclosure characters using the same character (i.e. " in Excel)
<?php
/**
* Converts a csv file into an array of lines and columns.
* khelibert@gmail.com
* @param $fileContent String
* @param string $escape String
* @param string $enclosure String
* @param string $delimiter String
* @return array
*/
function csvToArray($fileContent,$escape = '\\', $enclosure = '"', $delimiter = ';')
{
$lines = array();
$fields = array();
if($escape == $enclosure)
{
$escape = '\\';
$fileContent = str_replace(array('\\',$enclosure.$enclosure,"\r\n","\r"),
array('\\\\',$escape.$enclosure,"\\n","\\n"),$fileContent);
}
else
$fileContent = str_replace(array("\r\n","\r"),array("\\n","\\n"),$fileContent);
$nb = strlen($fileContent);
$field = '';
$inEnclosure = false;
$previous = '';
for($i = 0;$i<$nb; $i++)
{
$c = $fileContent[$i];
if($c === $enclosure)
{
if($previous !== $escape)
$inEnclosure ^= true;
else
$field .= $enclosure;
}
else if($c === $escape)
{
$next = ($i + 1 < $nb) ? $fileContent[$i+1] : ''; // do not read past the end of the string
if($next != $enclosure && $next != $escape)
$field .= $escape;
}
else if($c === $delimiter)
{
if($inEnclosure)
$field .= $delimiter;
else
{
//end of the field
$fields[] = $field;
$field = '';
}
}
else if($c === "\n")
{
$fields[] = $field;
$field = '';
$lines[] = $fields;
$fields = array();
}
else
$field .= $c;
$previous = $c;
}
//we add the last element
if(true || $field !== '')
{
$fields[] = $field;
$lines[] = $fields;
}
return $lines;
}
?>
xoneca at gmail dot com (2011-11-27 04:26:03)
Note that this function can also be used to parse other kinds of constructs. For example, I have used it to parse .htaccess AddDescription lines:
AddDescription "My description to the file." filename.jpg
Those lines can be parsed like this:
<?php
$line = 'AddDescription "My description to the file." filename.jpg';
$parsed = str_getcsv(
$line, # Input line
' ', # Delimiter
'"', # Enclosure
'\\' # Escape char
);
var_dump( $parsed );
?>
The output:
array(3) {
[0]=>
string(14) "AddDescription"
[1]=>
string(27) "My description to the file."
[2]=>
string(12) "filename.jpg"
}
ioplex at gmail dot com (2011-09-29 20:38:24)
Note that the CSV format handled by this function is not even close to the CSV emitted and consumed by spreadsheet programs. In truth, one could argue that the real CSV "specification" is simply the behavior of Microsoft Excel, not an RFC that was written 20 years after Excel started using CSV (although thankfully RFC 4180 matches Excel's behavior, with the exception that Excel uses a tab as a separator by default).
The CSV used by spreadsheet programs like Excel does not use escape characters to escape stray separator characters. Again, the default separator character is actually a tab and not a comma (but historically the format is still referred to as CSV). Excel uses quotes to enclose an element that has literal separator characters in it (and an extra quote to escape each literal quote within the quotes - strange but true). Another quirk is that Excel will not trim extra space around elements, including line breaks, meaning you can have multiple lines of text within an element just by putting it in quotes. See the Wikipedia entry for "Comma-separated values" for good examples that have quotes, escaped quotes, newlines and so on.
Note that if you want to generate data that Excel will load into cells when someone clicks on a link, use tab as a separator, quote elements that have tabs or quotes and escape literal quotes with an extra preceding quote.
The following routine can be used to properly quote an element and escape quotes with an extra quote in the way Excel expects:
<?php
function escapeCsvElement($str, $sep) {
$quot = false;
$si = 0;
$slen = strlen($str);
$ret = '';
while ($si < $slen) {
$ch = $str[$si];
if ($ch == $sep)
$quot = true;
if ($ch == '"') {
$quot = true;
$ret .= '"';
}
$ret .= $ch;
$si++;
}
if ($quot)
return '"' . $ret . '"';
return $str;
}
?>
Then write each row to the browser terminated by carriage-return / newline linebreaks and send headers like:
<?php
header('Content-type: application/csv');
header('Content-Disposition: attachment; filename=' . $filename);
?>
This is what Excel and other spreadsheet programs expect to automatically load your cell data as expected without triggering the file import wizard.
As for parsing real CSV, the *getcsv functions are not going to work. To parse the quotes and escaped quotes and literal line breaks and so on, that would be handled best with a proper state-machine parser.
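A rough sketch of what such a state-machine parser could look like, assuming the quoting rules described above (quotes escaped by doubling, separators and line breaks allowed inside quoted elements). The function name parse_csv_states() and its parameters are made up for illustration; this is not a built-in and not a complete implementation.

```php
<?php
// Hypothetical helper: parse CSV text into rows of fields, one character at a time.
function parse_csv_states($text, $sep = ',', $quote = '"')
{
    $rows = array();
    $row = array();
    $field = '';
    $inQuotes = false; // true while inside a quoted element
    $started = false;  // true once the current record has any content
    $len = strlen($text);

    for ($i = 0; $i < $len; $i++) {
        $ch = $text[$i];
        if ($inQuotes) {
            if ($ch !== $quote) {
                $field .= $ch;             // separators and line breaks are literal here
            } elseif ($i + 1 < $len && $text[$i + 1] === $quote) {
                $field .= $quote;          // doubled quote: one literal quote
                $i++;
            } else {
                $inQuotes = false;         // closing quote
            }
        } elseif ($ch === $quote) {
            $inQuotes = true;
            $started = true;
        } elseif ($ch === $sep) {
            $row[] = $field;               // end of field
            $field = '';
            $started = true;
        } elseif ($ch === "\r" || $ch === "\n") {
            if ($ch === "\r" && $i + 1 < $len && $text[$i + 1] === "\n") {
                $i++;                      // treat \r\n as a single line break
            }
            if ($started) {                // blank lines are skipped
                $row[] = $field;
                $rows[] = $row;
            }
            $row = array();
            $field = '';
            $started = false;
        } else {
            $field .= $ch;
            $started = true;
        }
    }
    if ($started) {                        // last record without a trailing line break
        $row[] = $field;
        $rows[] = $row;
    }
    return $rows;
}

print_r(parse_csv_states("a,\"b \"\"x\"\", c\",d\r\n\"line1\nline2\",e\n"));
```

Pass "\t" as the second argument for tab-separated data. Unlike the regex-based approaches, the only state carried between characters is whether the parser is currently inside quotes, which is what lets quoted line breaks survive.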
w_barath@hotmail,com (2011-08-02 19:19:28)
For people who haven't got access to str_getcsv() in their version of PHP and want to convert strings from CSV, I have seen many huge swaths of code and regexes to do this extremely simple task. Here's minimal, fast code:
<?php
function csv2array($input,$delimiter=',',$enclosure='"',$escape='\\'){
$fields=explode($enclosure.$delimiter.$enclosure,substr($input,1,-1));
foreach ($fields as $key=>$value)
$fields[$key]=str_replace($escape.$enclosure,$enclosure,$value);
return($fields);
}
function array2csv($input,$delimiter=',',$enclosure='"',$escape='\\'){
foreach ($input as $key=>$value)
$input[$key]=str_replace($enclosure,$escape.$enclosure,$value);
return $enclosure.implode($enclosure.$delimiter.$enclosure,$input).$enclosure;
}
$data=array("one","\"two\"");
for ($i=0;$i<100;$i++) $data=array(array2csv($data),$i);
echo "Encoded $i times...".var_export($data,true)."\n";
for ($j=0;$j<$i;$j++) $data=csv2array($data[0]);
echo "Decoded $i times...".var_export($data,true)."\n";
?>
Pretty straightforward. ;-)
Yes, I intentionally changed the function names,
M. Bunge (TLC Communications, DE) (2011-06-22 17:08:08)
Just another way to convert a csv file to an associative array.
<?php
//
// Convert csv file to associative array:
//
function csv_to_array($input, $delimiter='|')
{
$header = null;
$data = array();
$csvData = str_getcsv($input, "\n");
foreach($csvData as $csvLine){
if(is_null($header)) $header = explode($delimiter, $csvLine);
else{
$items = explode($delimiter, $csvLine);
for($n = 0, $m = count($header); $n < $m; $n++){
$prepareData[$header[$n]] = $items[$n];
}
$data[] = $prepareData;
}
}
return $data;
}
//-----------------------------------
//
//Usage:
$csvArr = csv_to_array(file_get_contents('test.csv'));
?>
durik at 3ilab dot net (2011-01-16 11:39:57)
Since str_getcsv(), unlike fgetcsv(), does not split a CSV string into rows, I have found the following easy workaround:
<?php
$Data = str_getcsv($CsvString, "\n"); //parse the rows
foreach($Data as &$Row) $Row = str_getcsv($Row, ";"); //parse the items in rows
?>
Why not use explode() instead of str_getcsv() to parse rows? Because explode() would not treat enclosed parts of the string or escaped characters correctly.
Anonymous (2010-10-25 00:25:22)
If your version of PHP doesn't have `str_getcsv` and you don't need custom $escape or $eol values, try this:
<?php if (!function_exists('str_getcsv')) {
function str_getcsv($input, $delimiter=',', $enclosure='"', $escape=null, $eol=null) {
$temp=fopen("php://memory", "rw");
fwrite($temp, $input);
fseek($temp, 0);
$r = array();
while (($data = fgetcsv($temp, 4096, $delimiter, $enclosure)) !== false) {
$r[] = $data;
}
fclose($temp);
return $r;
}
} ?>
[EDIT BY danbrown AT php DOT net: Contains a bugfix provided by (depely AT IAMNOTABOT prestaconcept.net) on 04-MAR-2011 with the following note: "The previous anonymous function only read the first line".]
Jay Williams (2010-08-10 13:50:05)
Here is a quick and easy way to convert a CSV file to an associative array:
<?php
/**
* @link http://gist.github.com/385876
*/
function csv_to_array($filename='', $delimiter=',')
{
if(!file_exists($filename) || !is_readable($filename))
return FALSE;
$header = NULL;
$data = array();
if (($handle = fopen($filename, 'r')) !== FALSE)
{
while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
{
if(!$header)
$header = $row;
else
$data[] = array_combine($header, $row);
}
fclose($handle);
}
return $data;
}
?>
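One caveat with array_combine(): it requires both arrays to have the same number of elements, so a row with missing or extra columns makes it fail (it returns FALSE with a warning on older versions and throws an error in PHP 8). A possible guard, sketched with a made-up helper name combine_row(), pads short rows with NULL and drops surplus values:

```php
<?php
// Hypothetical helper: pair a header row with a data row of any length.
function combine_row(array $header, array $row)
{
    $n = count($header);
    // Pad short rows with null, then cut long rows down to the header length.
    $row = array_slice(array_pad($row, $n, null), 0, $n);
    return array_combine($header, $row);
}

print_r(combine_row(array('code', 'country'), array('AD')));
```

Whether padding is the right policy depends on the data; rejecting malformed rows outright may be safer in some cases.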
hpartidas at deuz dot net (2010-05-25 09:50:49)
I found myself wanting to parse a CSV and didn't have access to str_getcsv, so I wrote a substitute for PHP < 5.3. Hope it helps someone out there stuck in the same situation.
<?php
if (!function_exists('str_getcsv')) {
function str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = '\\', $eol = '\n') {
if (is_string($input) && !empty($input)) {
$output = array();
$tmp = preg_split("/".$eol."/",$input);
if (is_array($tmp) && !empty($tmp)) {
while (list($line_num, $line) = each($tmp)) {
if (preg_match("/".$escape.$enclosure."/",$line)) {
while ($strlen = strlen($line)) {
$pos_delimiter = strpos($line,$delimiter);
$pos_enclosure_start = strpos($line,$enclosure);
if (
is_int($pos_delimiter) && is_int($pos_enclosure_start)
&& ($pos_enclosure_start < $pos_delimiter)
) {
$enclosed_str = substr($line,1);
$pos_enclosure_end = strpos($enclosed_str,$enclosure);
$enclosed_str = substr($enclosed_str,0,$pos_enclosure_end);
$output[$line_num][] = $enclosed_str;
$offset = $pos_enclosure_end+3;
} else {
if (empty($pos_delimiter) && empty($pos_enclosure_start)) {
$output[$line_num][] = substr($line,0);
$offset = strlen($line);
} else {
$output[$line_num][] = substr($line,0,$pos_delimiter);
$offset = (
!empty($pos_enclosure_start)
&& ($pos_enclosure_start < $pos_delimiter)
)
?$pos_enclosure_start
:$pos_delimiter+1;
}
}
$line = substr($line,$offset);
}
} else {
$line = preg_split("/".$delimiter."/",$line);
/*
* Validating against pesky extra line breaks creating false rows.
*/
if (is_array($line) && !empty($line[0])) {
$output[$line_num] = $line;
}
}
}
return $output;
} else {
return false;
}
} else {
return false;
}
}
}
?>
vincent dot enjalbert at gmail dot com (2010-02-19 03:37:25)
For a multi-line CSV string, just use:
<?php
$csvData = file_get_contents($fileName);
$csvNumColumns = 22;
$csvDelim = ";";
$data = array_chunk(str_getcsv($csvData, $csvDelim), $csvNumColumns);
?>
Raymond (2009-12-14 17:49:06)
Here's a little function to convert a multi-line CSV string to an array:
<?php
function csv_to_array($csv, $delimiter = ',', $enclosure = '"', $escape = '\\', $terminator = "\n") {
$r = array();
$rows = explode($terminator,trim($csv));
$names = array_shift($rows);
$names = str_getcsv($names,$delimiter,$enclosure,$escape);
$nc = count($names);
foreach ($rows as $row) {
if (trim($row)) {
$values = str_getcsv($row,$delimiter,$enclosure,$escape);
if (!$values) $values = array_fill(0,$nc,null);
$r[] = array_combine($names,$values);
}
}
return $r;
}
?>
Jrg Wagner (2009-10-12 14:23:00)
Here is a very concise replacement for str_getcsv.
No escaping of the enclosure char though, but an additional possibility to preserve the enclosing characters around a field.
Note that the fourth parameter therefore has a different meaning!
<?php
function csv_explode($delim=',', $str, $enclose='"', $preserve=false){
$resArr = array();
$n = 0;
$expEncArr = explode($enclose, $str);
foreach($expEncArr as $EncItem){
if($n++%2){
array_push($resArr, array_pop($resArr) . ($preserve?$enclose:'') . $EncItem.($preserve?$enclose:''));
}else{
$expDelArr = explode($delim, $EncItem);
array_push($resArr, array_pop($resArr) . array_shift($expDelArr));
$resArr = array_merge($resArr, $expDelArr);
}
}
return $resArr;
}
?>
dave_walter at yahoo dot com (2009-06-04 09:51:20)
Just to clarify, my str_putcsv() function was only ever designed to complement the functionality of the str_getcsv() built-in function, which can only handle converting one line of input into a single level array. For example, this code:
<?php
var_dump( str_getcsv( "a,b,c\nd,e,f", "," ));
?>
generates this output:
array(5) {
[0]=>
string(1) "a"
[1]=>
string(1) "b"
[2]=>
string(3) "c
d"
[3]=>
string(1) "e"
[4]=>
string(1) "f"
}
Even fgetcsv() and fputcsv() only work with a single line. All the examples show them being used within a loop of some sort.
I was also avoiding the artificial restriction on the length of the CSV string introduced by Ulf's modification.
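For completeness, here is a sketch of the kind of loop mentioned above, applied to a string instead of a file. The helper name csv_rows() is made up, and the $escape argument of fgetcsv() assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: split a multi-line CSV string into rows of fields.
function csv_rows($csv, $delimiter = ',', $enclosure = '"', $escape = '\\')
{
    $fp = fopen('php://temp', 'r+');
    fwrite($fp, $csv);
    rewind($fp);
    $rows = array();
    // fgetcsv() returns one record per call and false at the end of the input.
    while (($row = fgetcsv($fp, 0, $delimiter, $enclosure, $escape)) !== false) {
        $rows[] = $row;
    }
    fclose($fp);
    return $rows;
}

print_r(csv_rows("a,b,c\nd,e,f"));
```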
Ulf (2009-05-28 06:24:09)
As Dave's function also had the problem with only one line being returned here's a slightly changed version:
<?php
function str_putcsv($input, $delimiter = ',', $enclosure = '"') {
// Open a memory "file" for read/write...
$fp = fopen('php://temp', 'r+');
// ... write the $input array to the "file" using fputcsv()...
fputcsv($fp, $input, $delimiter, $enclosure);
// ... rewind the "file" so we can read what we just wrote...
rewind($fp);
// ... read the entire line into a variable...
$data = fread($fp, 1048576); // [changed]
// ... close the "file"...
fclose($fp);
// ... and return the $data to the caller, with the trailing newline from fgets() removed.
return rtrim( $data, "\n" );
}
?>
It assumes that one line won't exceed 1 MB of data. That should be more than enough.
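If the fixed read size is a concern, one possible variation is to read the stream with stream_get_contents(), which returns whatever remains in the stream. This is only a sketch: the name csv_line() is made up, and passing $escape to fputcsv() assumes PHP 5.5.4 or later.

```php
<?php
// Hypothetical helper (not a built-in): format one row as a CSV line.
function csv_line(array $fields, $delimiter = ',', $enclosure = '"', $escape = '\\')
{
    $fp = fopen('php://temp', 'r+');
    fputcsv($fp, $fields, $delimiter, $enclosure, $escape);
    rewind($fp);
    // Read everything that was written, whatever its length.
    $data = stream_get_contents($fp);
    fclose($fp);
    // fputcsv() ends the record with a newline; drop it.
    return rtrim($data, "\n");
}

echo csv_line(array('a', 'b')), "\n";
```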
Anonymous (2009-03-16 22:02:16)
For some reason o'connor's code only reads one line of a csv for me... I had to replace the line
$data = fgetcsv($fp, 1000, $delimiter, $enclosure); // $escape only got added in 5.3.0
with this:
$data = array();
while (!feof($fp))
{
$data[] = fgetcsv($fp, 0, $delimiter, $enclosure); // $escape only got added in 5.3.0
}
...to get all of the data out of my string (some post data pasted into a textbox and processed only with stripslashes).
dave_walter at NOSPAM dot yahoo dot com (2009-02-06 15:32:20)
Drawing inspiration from daniel dot oconnor at gmail dot com, here's an alternative str_putcsv() that leverages existing PHP core functionality (5.1.0+) to avoid re-inventing the wheel.
<?php
if(!function_exists('str_putcsv')) {
function str_putcsv($input, $delimiter = ',', $enclosure = '"') {
// Open a memory "file" for read/write...
$fp = fopen('php://temp', 'r+');
// ... write the $input array to the "file" using fputcsv()...
fputcsv($fp, $input, $delimiter, $enclosure);
// ... rewind the "file" so we can read what we just wrote...
rewind($fp);
// ... read the entire line into a variable...
$data = fgets($fp);
// ... close the "file"...
fclose($fp);
// ... and return the $data to the caller, with the trailing newline from fgets() removed.
return rtrim( $data, "\n" );
}
}
?>
Jeremy (2009-01-20 17:20:46)
After using several methods in the past to create CSV strings without using files (disk IO sucks), I finally decided it's time to write a function to handle it all. This function could use some cleanup, and the variable type test might be overkill for what is needed, I haven't thought about it too much.
Also, I took the liberty of replacing fields with certain data types with strings which I find much easier to work with. Some of you may not agree with those. Also, please note that the type "double" or float has been coded specifically for two digit precision because if I am using a float, it's most likely for currency.
I am sure some of you out there would appreciate this function.
<?php
function str_putcsv($array, $delimiter = ',', $enclosure = '"', $terminator = "\n") {
# First convert associative array to numeric indexed array
foreach ($array as $key => $value) $workArray[] = $value;
$returnString = ''; # Initialize return string
$arraySize = count($workArray); # Get size of array
for ($i=0; $i<$arraySize; $i++) {
# Nested array, process nest item
if (is_array($workArray[$i])) {
$returnString .= str_putcsv($workArray[$i], $delimiter, $enclosure, $terminator);
} else {
switch (gettype($workArray[$i])) {
# Manually set some strings
case "NULL": $_spFormat = ''; break;
case "boolean": $_spFormat = ($workArray[$i] == true) ? 'true': 'false'; break;
# Make sure sprintf has a good datatype to work with
case "integer": $_spFormat = '%d'; break;
case "double": $_spFormat = '%0.2f'; break;
case "string": $_spFormat = '%s'; break;
# Unknown or invalid items for a csv - note: the datatype of array is already handled above, assuming the data is nested
case "object":
case "resource":
default: $_spFormat = ''; break;
}
$returnString .= sprintf('%2$s'.$_spFormat.'%2$s', $workArray[$i], $enclosure);
$returnString .= ($i < ($arraySize-1)) ? $delimiter : $terminator;
}
}
# Done the workload, return the output information
return $returnString;
}
?>
daniel dot oconnor at gmail dot com (2009-01-19 16:48:15)
Don't have this? Ask fgetcsv() to do it for you.
5.1.0+
<?php
if (!function_exists('str_getcsv')) {
function str_getcsv($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
$fiveMBs = 5 * 1024 * 1024;
$fp = fopen("php://temp/maxmemory:$fiveMBs", 'r+');
fputs($fp, $input);
rewind($fp);
$data = fgetcsv($fp, 1000, $delimiter, $enclosure); // $escape only got added in 5.3.0
fclose($fp);
return $data;
}
}
?>
Anonymous (2009-01-12 08:55:43)
Thanks Rob, have used your code. It needs a minor tweak so that the delimiter option is processed as it should be. Line 5 should read:
<?php $expr="/$delimiter(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/"; // added ?>
instead of:
<?php $expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/"; // added ?>
JS, London
Rob (2008-11-07 08:54:04)
This does the same as the example below, giving you an array, but it allows to parse csv with commas in between quotes (used for addresses)... and it takes out the quotes from an array entry as well.
<?php
function parse_csv($file, $options = null) {
$delimiter = empty($options['delimiter']) ? "," : $options['delimiter'];
$to_object = empty($options['to_object']) ? false : true;
$expr="/$delimiter(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/"; // added
$str = $file;
$lines = explode("\n", $str);
$field_names = explode($delimiter, array_shift($lines));
foreach ($lines as $line) {
// Skip the empty line
if (empty($line)) continue;
$fields = preg_split($expr,trim($line)); // added
$fields = preg_replace("/^\"(.*)\"$/s","$1",$fields); //added
//$fields = explode($delimiter, $line);
$_res = $to_object ? new stdClass : array();
foreach ($field_names as $key => $f) {
if ($to_object) {
$_res->{$f} = $fields[$key];
} else {
$_res[$f] = $fields[$key];
}
}
$res[] = $_res;
}
return $res;
}
?>
[EDIT BY danbrown AT php DOT net: Includes bugfixes provided by "Anonymous" on 12-JAN-09 and "joelfromlollypins.com" on 19-JAN-09.]
william dot j dot weir at gmail dot com (2008-09-18 03:19:26)
If you're happy enough having just a multi-dimensional array, this should work fine. I had wanted to use the one provided by keananda but it was choking on pr($lines).
<?php
function f_parse_csv($file, $longest, $delimiter) {
$mdarray = array();
$file = fopen($file, "r");
while ($line = fgetcsv($file, $longest, $delimiter)) {
array_push($mdarray, $line);
}
fclose($file);
return $mdarray;
}
?>
$longest is a number that represents the longest line in the csv file as required by fgetcsv(). The page for fgetcsv() said that the longest line could be set to 0 or left out, but I couldn't get it to work without. I just made it extra large when I had to use it.
keananda at gmail dot com (2008-09-15 04:29:18)
For those who need this function but do not yet have it installed in their environment, you can use my function below.
You can parse your csv file into an associative array (by default) for each line, or into an object.
<?php
function parse_csv($file, $options = null) {
$delimiter = empty($options['delimiter']) ? "," : $options['delimiter'];
$to_object = empty($options['to_object']) ? false : true;
$str = file_get_contents($file);
$lines = explode("\n", $str);
// pr($lines); // debugging helper that plain PHP does not define; left commented out
$field_names = explode($delimiter, array_shift($lines));
foreach ($lines as $line) {
// Skip the empty line
if (empty($line)) continue;
$fields = explode($delimiter, $line);
$_res = $to_object ? new stdClass : array();
foreach ($field_names as $key => $f) {
if ($to_object) {
$_res->{$f} = $fields[$key];
} else {
$_res[$f] = $fields[$key];
}
}
$res[] = $_res;
}
return $res;
}
?>
NOTE:
Line number 1 of the csv file will be considered as header (field names).
TODO:
- Enclosure handling
- Escape character handling
- Other features/enhancements as you need
EXAMPLE USE:
Content of /path/to/file.csv:
CODE,COUNTRY
AD,Andorra
AE,United Arab Emirates
AF,Afghanistan
AG,Antigua and Barbuda
<?php
$arr_csv = parse_csv("/path/to/file.csv");
print_r($arr_csv);
?>
// Output:
Array
(
[0] => Array
(
[CODE] => AD
[COUNTRY] => Andorra
)
[1] => Array
(
[CODE] => AE
[COUNTRY] => United Arab Emirates
)
[2] => Array
(
[CODE] => AF
[COUNTRY] => Afghanistan
)
[3] => Array
(
[CODE] => AG
[COUNTRY] => Antigua and Barbuda
)
)
<?php
$obj_csv = parse_csv("/path/to/file.csv", array("to_object" => true));
print_r($obj_csv);
?>
// Output:
Array
(
[0] => stdClass Object
(
[CODE] => AD
[COUNTRY] => Andorra
)
[1] => stdClass Object
(
[CODE] => AE
[COUNTRY] => United Arab Emirates
)
[2] => stdClass Object
(
[CODE] => AF
[COUNTRY] => Afghanistan
)
[3] => stdClass Object
(
[CODE] => AG
[COUNTRY] => Antigua and Barbuda
)
[4] => stdClass Object
(
[CODE] =>
[COUNTRY] =>
)
)
// If you use character | (pipe) as delimiter in your csv file, use:
<?php
$arr_csv = parse_csv("/path/to/file.csv", array("delimiter"=>"|"));
?>
==NSD==
colin_mckinnon at slc dot co dot uk (2008-06-09 09:00:45)
Regarding RFC 4180 - I asked the author about the specifics of the format described which poses a number of technical difficulties in building a parser - his response is shown below:
quote:
Please do keep one thing in mind - this RFC
was meant to define the MIME type for CSV rather than the format. It
happens to be that no format definition existed and I was forced to
define one. As the RFC states:
"This section documents the format that seems to be followed by most
implementations"
csv at rfc dot org (2008-05-05 13:15:14)
RFC 4180 which deals with CSVs states the escape character is supposed to be a double quotation mark: (page 2)
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
peev[dot]alexander at gmail dot com (2008-04-20 14:22:35)
CSV parsing and storage is not that hard to implement - see my example functions ( I believe they do a pretty good job - I use them in a production environment ):
<?php
if( !function_exists("parse_csv") ){
function parse_csv($string){
/* Author : Alexander Peev, posted at PHP.NET */
if( !function_exists("parse_csv_aux") ){
function parse_csv_aux( $string ){
$product = "";
$in_quote = FALSE;
$skipped_quote = FALSE;
for( $i = 0 ; $i < strlen($string) ; $i++ ){
if( $string[$i] == "\"" ){
if($in_quote){
if($skipped_quote){
$product .= "\"";
$skipped_quote = FALSE;
}
else if( !$skipped_quote ){
$skipped_quote = TRUE;
}
}
else{
if($skipped_quote) $skipped_quote = FALSE;
$in_quote = TRUE;
}
}
else if( $string[$i] == ";" ){
if($in_quote){
$product .= ";";
}
else{
$product .= " ; ";
}
}
else{
if($in_quote){
$in_quote = FALSE;
$product .= $string[$i];
}
else{
$product .= $string[$i];
}
}
}
return $product;
}
}
$data = array();
if( is_string($string) && ( stripos($string, "\n") !== FALSE ) ){
$data = explode("\n", parse_csv_aux($string) );
foreach($data as $key => $row){
$columns = array();
//$row = strtr( $row, array( "\";\"" => "\";\"", ";" => " ; " ) );
if( stripos($row, " ; ") !== FALSE ){
$columns = explode( " ; ", $row );
if( !is_array($columns) )$columns = array( strval($columns) );
$data[$key] = $columns;
}
}
return $data;
}
else if( is_string($string) && ( stripos( ($string = parse_csv_aux($string)), " ; ") !== FALSE ) ){
$columns = explode( " ; ", $string );
if( !is_array($columns) )$columns = array( strval($columns) );
return array($columns);
}
else return strval($string);
} /* end function parse_csv */
} /* end not function exists parse_csv */
if( !function_exists("store_csv") ){
function store_csv($data){
/* Author : Alexander Peev, posted at PHP.NET */
if( !function_exists("store_csv_aux") ){
function store_csv_aux( $string ){
$string = strtr( $string, array( "\n" => "" ) );
$product = "";
$in_quote = FALSE;
for( $i = 0 ; $i < strlen($string) ; $i++ ){
if( $string[$i] == "\"" ){
if($in_quote){
$product .= "\"\"";
}
else{
$product .= "\"\"\"";
$in_quote = TRUE;
}
}
else if( $string[$i] == ";" ){
if($in_quote){
$product .= ";";
}
else{
$product .= "\";";
$in_quote = TRUE;
}
}
else{
if($in_quote){
$product .= "\"";
$in_quote = FALSE;
$product .= $string[$i];
}
else{
$product .= $string[$i];
}
}
}
if($in_quote)$product .= "\"";
return $product;
}
}
if(!is_array($data))return strval($data);
$passed_rows = FALSE;
$product = "";
foreach($data as $row){
if( $passed_rows )$product .= "\n";
if( is_array($row) ){
$columns = "";
$passed_cols = FALSE;
foreach($row as $column){
if( $passed_cols )$columns .= ";";
$columns .= store_csv_aux( $column );
$passed_cols =TRUE;
}
$product .= strval($columns);
}
else{
$product .= strtr( strval($row), array("\n" => "") );
}
$passed_rows = TRUE;
}
return $product;
} /* end function store_csv */
} /* end not function exists store_csv */
?>
justin at cam dot org (2007-02-15 22:25:07)
There's a discussion of how to perform this task in the user notes for the split() function.
http://www.php.net/manual/en/function.split.php