Arrays of Strings - problem
marktyers
2010-09-06 21:25:02 UTC
I have been revisiting Pascal recently by installing Free Pascal and
digging out some of my old textbooks. I have been trying to solve a
number of simple problems and have mostly met with success, but am
stumped by one of them.

The challenge was to read a text file a line at a time (CSV format),
extract the last (third) value on each line, and add these up.
The text file contained the following:

fuel,SMITH,55.99
accomodation,SMITH,28.50
food,SMITH,12.99
food,JONES,18.50
books,JONES,5.99
accomodation,SMITH,55.68
books,SMITH,24.98

A Python version of the code to solve this looks like:

total = 0
for line in open('expenses.txt', 'r'):
    line = line.strip()
    print line
    data = line.split(",")
    # print data
    print data[2].strip()
    total = total + float(data[2].strip())
print "TOTAL:",total

My Pascal code looks like this:

program fileread(output);
uses
  sysutils;
var
  fp : Text;
  line : String[30];
  len : Integer;
  i : Integer;
  words : Integer;
  wordnum : Integer;
  place : Integer;
  items : Array of String[30];
begin
  Assign(fp, 'expenses.txt');
  Reset(fp);
  while not eof(fp) do
  begin
    Readln(fp, line);
    writeln(line);
    len := length(line);
    writeln('letters:'+intToStr(len) );
    words := 1;
    for i:= 1 to length(line) do
      if line[i] = ',' then words := words + 1;
    writeln('words: '+intToStr(words) );
    SetLength(items, words);
    wordnum := 1;
    place := 1;
    for i:= 1 to length(line) do
    begin
      if line[i] = ',' then
      begin
        place := 1;
        wordnum := wordnum + 1;
      end
      else begin
        items[wordnum][place] := line[i];
        writeln(intToStr(wordnum)+' - '+intToStr(place)+' - '+line[i]);
        place := place + 1;
      end;
    end;
    writeln(items[3]);
    writeln();
    for i:= 1 to length(items[3]) do
      write(items[3][i]);
    writeln();
  end
end.

The problem is in the splitting of each string to extract the correct
data. Pascal does not seem to have any prebuilt function for this so I
had to write my own algorithm. I think that I have misunderstood
arrays of strings because I produced a working version in C to prove
the validity of my logic:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

main()
{
    char line[256];
    FILE *fp;
    int i;
    float total = 0;
    fp = fopen("expenses.txt", "r");
    while ( fgets(line, 256, fp) != NULL )
    {
        int len = strlen(line);
        if (line[len-1] == '\n') line[len-1] = 0;
        printf("%s\n", line);
        int words = 1;
        for (i=0; i<strlen(line); i++) if (line[i]==',') words++;
        //printf("words: %d\n", words);
        char items[words][32];
        int wordnum = 0;
        int place = 0;
        for (i=0; i<strlen(line); i++)
        {
            if (line[i] == ',')
            {
                place = 0;
                wordnum++;
            }
            else
            {
                //printf("%d - %d - %c\n", wordnum, place, line[i]);
                items[wordnum][place] = line[i];
                place++;
            }
        }
        //printf("%s\n", items[2]);
        printf("%.2f\n", atof(items[2]) );
        total += atof(items[2]);
    }
    printf("TOTAL: %.2f\n", total);
}

Can anyone explain where I am going wrong?

Thanks in advance.

Mark
Wolf Behrenhoff
2010-09-06 22:24:05 UTC
Post by marktyers
The challenge was to read a text file in a line at a time (csv
format). extract the last (third) value on each line and add these up.
fuel,SMITH,55.99
accomodation,SMITH,28.50
food,SMITH,12.99
food,JONES,18.50
books,JONES,5.99
accomodation,SMITH,55.68
books,SMITH,24.98
Yes, for this type of task Python and/or Perl are great. I would
probably use:
$ perl -E'$sum+=(split/,/)[2] while <>;say "Sum is $sum"'
Post by marktyers
program fileread(output);
The "(output)" parameter list is not needed.
Post by marktyers
uses
sysutils;
var
fp : Text;
line : String[30];
If you are using String[30] then you have "old style" Turbo Pascal
strings, i.e. line[0] contains the length of the string (also known as
ShortString in FPC). Why limit the length to 30? Just use String (and
switch on default usage of "Delphi style" strings, i.e. $H+) or use
AnsiString.
Post by marktyers
else begin
items[wordnum][place] := line[i];
Here you set the place-th char of word wordnum to line[i]. But that does
not update the length of the string! It is even wrong code if the string
has fewer chars than place. Simply use normal string concatenation
(string:=string+character;).

In C this might work if the whole string is filled with zeroes
beforehand, as C determines the length using the null byte.
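
To make the length problem concrete, here is a minimal sketch, assuming Free Pascal and ShortString variables like the ones in the original program. The program name and the variables byIndex, byConcat and Source are made up for illustration. Writing through an index stores the character but leaves the length byte alone, so the string still reports a length of 0; concatenation keeps the length in step.

```pascal
program ShortStringLength;
{ Sketch only: writing s[i] into a ShortString stores the character but
  leaves the length byte alone; concatenation updates the length. }
const
  Source: String[30] = '55.99';   { hypothetical sample field }
var
  byIndex, byConcat: String[30];
  i: Integer;
begin
  byIndex := '';
  byConcat := '';
  for i := 1 to Length(Source) do
  begin
    byIndex[i] := Source[i];            { character stored, length byte unchanged }
    byConcat := byConcat + Source[i];   { length grows with each character }
  end;
  WriteLn('indexed:      length=', Length(byIndex));
  WriteLn('concatenated: length=', Length(byConcat), ' text=', byConcat);
end.
```

This is why writeln(items[3]) in the original printed nothing useful even though the characters had been copied.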
Post by marktyers
Pascal does not seem to have any prebuilt function for this so I
had to write my own algorithm.
You might want to have a look at the documentation of TStrings and
TStringList :-)

- Wolf
delo
2010-09-07 11:07:00 UTC
snip
Why do you need an array of strings? Probably there is something you have
not told us...
If you only need to sum the third field, just use a large enough numeric
type and total into it.
To isolate the third field you can use the Pos function (I am unsure about
it in FP), searching for a ','.
Unfortunately the Pos function does not take a start position (as in BASIC),
so you can simply delete the part of the string up to the first ',', then
delete up to the second ','. If you then find a third ',', copy the chars
from 1 to position-1 and you have the third field. You also need to think
about leading or trailing blanks.
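
The Pos-and-Delete idea above might look roughly like this. It is only a sketch, assuming Free Pascal in its default mode; ThirdField is a hypothetical helper name, and blank trimming is left out.

```pascal
program ThirdFieldByPos;
{ Sketch of the Pos/Delete approach: chop off everything up to and
  including the first comma, twice; what is left starts with field three. }

function ThirdField(line: string): string;   { hypothetical helper }
var
  i, p: Integer;
begin
  for i := 1 to 2 do
  begin
    p := Pos(',', line);
    if p = 0 then
    begin
      ThirdField := '';        { fewer than three fields on this line }
      Exit;
    end;
    Delete(line, 1, p);        { line is a by-value copy, safe to modify }
  end;
  p := Pos(',', line);         { a third comma would start a fourth field }
  if p > 0 then
    ThirdField := Copy(line, 1, p - 1)
  else
    ThirdField := line;
end;

begin
  WriteLn(ThirdField('fuel,SMITH,55.99'));
end.
```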

Another kind of solution is to scan the string char by char, using the s[i]
syntax (TP) from 1 to the length of the string: accumulate chars in a temp
string, put the temp string into an array element when you reach a ',', then
increment an index and proceed (if you want to extract all the fields).
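
A rough sketch of that char-by-char scan, assuming Free Pascal in objfpc mode with long strings. TFields and SplitLine are hypothetical names. Note that Free Pascal dynamic arrays are indexed from 0, so the third field is element 2.

```pascal
program SplitByScan;
{$mode objfpc}{$H+}
type
  TFields = array of string;   { hypothetical type name }

{ Hypothetical helper: splits line at every sep, building each field
  by concatenation so that its length is always correct. }
function SplitLine(const line: string; sep: Char): TFields;
var
  i, count: Integer;
  temp: string;
begin
  SetLength(Result, 0);
  count := 0;
  temp := '';
  for i := 1 to Length(line) do
    if line[i] = sep then
    begin
      SetLength(Result, count + 1);
      Result[count] := temp;   { store the finished field }
      Inc(count);
      temp := '';
    end
    else
      temp := temp + line[i];
  SetLength(Result, count + 1);
  Result[count] := temp;       { last field has no trailing separator }
end;

var
  fields: TFields;
begin
  fields := SplitLine('books,JONES,5.99', ',');
  WriteLn(Length(fields), ' fields, third = ', fields[2]);
end.
```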

long live to pascal - every kind of pascal
bye
delo
Marco van de Voort
2010-09-07 11:31:09 UTC
snip
why you need an array of strings? probably something of untold... if the
need is only summarize the third field just use a large type a just
totalize in it to isolate the third field you can use the pos function
(unsure about it in fp) searching for a ',' unfortunately the pos function
doesnt have a start position ( like basic) so you can simply delete the
part of string to the first ',' then delete to the second ',' when you
find the third ',' then copy the chars from 1 to position-1 and you have
the third field, other consideration about lead or trail blanks.
Borland-style Pascals have had a Pos with a start offset since Delphi 7,
called PosEx. FPC also has it.
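
As an illustration, a minimal sketch using PosEx from FPC's StrUtils unit to find the second comma without modifying the string. ThirdField is a hypothetical helper name, and the sketch assumes the third field is the last one on the line.

```pascal
program ThirdFieldByPosEx;
{$mode objfpc}{$H+}
uses
  StrUtils;   { PosEx lives here in FPC }

function ThirdField(const line: string): string;   { hypothetical helper }
var
  first, second: Integer;
begin
  Result := '';
  first := PosEx(',', line, 1);
  if first = 0 then Exit;                 { no comma at all }
  second := PosEx(',', line, first + 1);  { search after the first comma }
  if second = 0 then Exit;                { only two fields }
  Result := Copy(line, second + 1, Length(line) - second);
end;

begin
  WriteLn(ThirdField('food,SMITH,12.99'));
end.
```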

FPC also has backwards versions (rpos,rposex) as well as set variants:

Function PosEx(const SubStr, S: string; Offset: Cardinal): Integer;
Function PosEx(const SubStr, S: string): Integer;inline; // Offset: Cardinal= 1
Function PosEx(c:char; const S: string; Offset: Cardinal): Integer;


Function RPosEX(C:char;const S : AnsiString;offs:cardinal):Integer; overload;
Function RPosex (Const Substr : AnsiString; Const Source : AnsiString;offs:cardinal) : Integer; overload;
Function RPos(c:char;const S : AnsiString):Integer; overload;
Function RPos (Const Substr : AnsiString; Const Source : AnsiString) : Integer; overload;

function PosSet (const c:TSysCharSet;const s : ansistring ):Integer;
function PosSet (const c:string;const s : ansistring ):Integer;
function PosSetEx (const c:TSysCharSet;const s : ansistring;count:Integer):Integer;
function PosSetEx (const c:string;const s : ansistring;count:Integer):Integer;


But I think Wolf's suggestion to use TStringList (with StrictDelimiter and
Delimiter = ',') makes more sense.

var x : TStringList;

begin
  x:=TStringList.Create;
  x.delimiter:=','; x.strictdelimiter:=true;

  while ...
  begin
    readln(f,sometext);
    x.delimitedtext:=sometext;
    // x[0] .. x[x.count-1] contains the fields here.
  end;

This also works in Delphi (2006+)
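
For completeness, the fragment above might be fleshed out into a whole program along these lines. This is a sketch under assumptions: Free Pascal in objfpc mode, expenses.txt in the current directory, and no handling for a missing file. The program and variable names are made up for illustration.

```pascal
program SumWithStringList;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
var
  f: Text;
  line: string;
  fields: TStringList;
  amount, total: Double;
  code: Integer;
begin
  total := 0;
  fields := TStringList.Create;
  try
    fields.Delimiter := ',';
    fields.StrictDelimiter := True;   { split on commas only, not on spaces }
    Assign(f, 'expenses.txt');
    Reset(f);
    while not Eof(f) do
    begin
      ReadLn(f, line);
      fields.DelimitedText := line;   { fields[0] .. fields[Count-1] }
      if fields.Count >= 3 then
      begin
        Val(Trim(fields[2]), amount, code);
        if code = 0 then              { code = 0 means a valid number }
          total := total + amount;
      end;
    end;
    Close(f);
  finally
    fields.Free;
  end;
  WriteLn('TOTAL: ', total:0:2);
end.
```

Lines with fewer than three fields, or with a third field that is not a number, are silently skipped here; a real program would probably want to report them.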
dik
2010-09-07 19:34:01 UTC
PROGRAM READ_CSV ; {

+ fuel,SMITH,55.99
+ accomodation,SMITH,28.50
+ food,SMITH,12.99
+ food,JONES,18.50
+ books,JONES,5.99
+ accomodation,SMITH,55.68
+ books,SMITH,24.98

+ test,TEST,24 98

} const strmax = 255 ;

function pack_line (line : string) : string ;
var i : byte ; l : byte absolute line ;
begin
while (l > 0) and (line [l] in [' ',#0]) do dec (l) ; i := 1 ;
while (i <= l) and (line [i] in [' ',#0]) do inc (i) ;
if i > 1
then pack_line := copy (line, i, strmax)
else pack_line := line ;
end ;

function next_value (var line : string ; seperator : char) : string ;
var i : byte ;
begin
i := pos (seperator, line) ; if i > 0 then begin
next_value := copy (line, 1 , i - 1 ) ;
line := copy (line, i + 1, strmax) ;
end else begin
next_value := line ;
line := '' ;
end ; end ;

var line : string ; line_num : word ;

function next_value_line : string ;
begin
next_value_line := pack_line (next_value (line,',')) ;
end ;

const width = 13 ;

var expense : string[width] ;
var person : string[width] ;
var amount : string[width] ; total : real ;

var f : text ; r : real ; code : integer ;

BEGIN

assign (f,'READ_CSV.PAS') ; reset (f) ;

line_num := 0 ;
total := 0 ;

while not eof (f) do begin

readln (f,line) ; inc (line_num) ;

if (length (line) > 3) and (line[1] = '+') then begin

line := copy (line,4,strmax) ;

write ('line_num ',line_num:4,' ') ;

expense := next_value_line ;
person := next_value_line ;
amount := next_value_line ;

write (expense : width, ' ') ;
write (person : width, ' ') ;
write (amount : width, ' ') ;

val (amount, r, code) ; if code = 0

then total := total + r
else write ('BAD!') ;

writeln ;

end ; end ;

writeln ; writeln ('total ',total:0:2) ;

END .
Dr J R Stockton
2010-09-07 20:05:16 UTC
In comp.lang.pascal.borland message <850185b7-38e5-4800-9496-76246a2c0d2
Post by marktyers
The challenge was to read a text file in a line at a time (csv
format). extract the last (third) value on each line and add these up.
Assuming that the fields are comma-separated, all you need to do, having
read a line, to extract the datum is to scan forwards, character by
character, to find the second comma (or backwards to find the last
comma), then use the Copy function to get the substring, then use the
Val function to get it into, say, a Double.
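
A minimal sketch of the backward scan described above, assuming Free Pascal. LastFieldAsDouble is a hypothetical helper name; it relies on the default short-circuit Boolean evaluation so that line[i] is never read with i = 0.

```pascal
program LastFieldValue;

{ Hypothetical helper: scans backwards for the last comma, copies what
  follows it, and converts that with Val. Returns False if there is no
  comma or the text is not a valid number. }
function LastFieldAsDouble(const line: string; var value: Double): Boolean;
var
  i, code: Integer;
begin
  i := Length(line);
  while (i > 0) and (line[i] <> ',') do
    Dec(i);
  Val(Copy(line, i + 1, Length(line) - i), value, code);
  LastFieldAsDouble := (i > 0) and (code = 0);
end;

var
  d: Double;
begin
  if LastFieldAsDouble('accomodation,SMITH,55.68', d) then
    WriteLn(d:0:2);
end.
```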
--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/&c., FAQqy topics & links;
<URL:http://www.merlyn.demon.co.uk/clpb-faq.txt> RAH Prins : c.l.p.b mFAQ;
<URL:ftp://garbo.uwasa.fi/pc/link/tsfaqp.zip> Timo Salmi's Turbo Pascal FAQ.
Rugxulo
2010-09-08 07:30:55 UTC
Hi,
Post by marktyers
I have been revisiting Pascal recently by installing free pascal
Nice, ain't it? ;-) I'm just glad somebody, anybody, is posting in
one of the Pascal newsgroups!
Post by marktyers
The challenge was to read a text file in a line at a time (csv
format). extract the last (third) value on each line and add these up.
Sounds simple enough. Not necessarily the best reason to use Pascal,
but not the worst either. Personally I would use GAWK, which has a
very similar example in its manual. But Pascal works too.

EDIT: Oops, this is "standard" ISO 7185 Pascal, heh, which might mean
this is the wrong newsgroup. (Sue me!) However, I think CBFalconer's
string package lets you fake get/put, so nyah. I'm just using GPC here
(which is "mostly" BP compatible too). ;-)

--------------------------------------------------
{$classic-pascal}
program testprog(input,output,testfile);

var testfile: text; tempreal, sum: real;

begin { testprog }

sum := 0; reset(testfile);

while not eof(testfile) do
begin
  while testfile^ <> ',' do get(testfile); get(testfile);
  while testfile^ <> ',' do get(testfile); get(testfile);
  readln(testfile,tempreal); writeln(tempreal,sum);
  sum := sum + tempreal;
end;

writeln('The total sum is',sum);

end. { testprog }
--------------------------------------------------

Not too complex, is it? I'm a fairly sucky programmer though, but it
seems to work:

C:\blah [ GPC ] >type expenses.txt
fuel,SMITH,55.99
accomodation,SMITH,28.50
food,SMITH,12.99
food,JONES,18.50
books,JONES,5.99
accomodation,SMITH,55.68
books,SMITH,24.98

C:\blah [ GPC ] >testprog --gpc-rts -ntestfile:expenses.txt
5.5990000000000002e+01 0.0000000000000000e+00
2.8500000000000000e+01 5.5990000000000002e+01
1.2990000000000000e+01 8.4490000000000009e+01
1.8500000000000000e+01 9.7480000000000004e+01
5.9900000000000002e+00 1.1598000000000000e+02
5.5680000000000000e+01 1.2197000000000000e+02
2.4980000000000000e+01 1.7765000000000001e+02
The total sum is 2.0263000000000000e+02

C:\blah [ GPC ] >\jkw\rugxulo\djgpp\bin\gawk -F, "{ sum += $3; }; END { print sum }" expenses.txt
202.63
marktyers
2010-09-13 21:05:45 UTC
Cheers for the solution. Not sure how Free Pascal differs from
'classic pascal' but I get a few errors.

examples > fpc fileread3.pas
Free Pascal Compiler version 2.4.0 [2009/12/20] for i386
Copyright (c) 1993-2009 by Florian Klaempfl
Target OS: Darwin for i386
Compiling fileread3.pas
fileread3.pas(7,19) Error: Illegal qualifier
fileread3.pas(7,32) Error: Identifier not found "get"
fileread3.pas(7,47) Error: Identifier not found "get"
fileread3.pas(8,19) Error: Illegal qualifier
fileread3.pas(8,32) Error: Identifier not found "get"
fileread3.pas(8,47) Error: Identifier not found "get"
fileread3.pas(13,17) Fatal: There were 6 errors compiling module, stopping
Fatal: Compilation aborted
Error: /usr/local/bin/ppc386 returned an error exitcode (normal if you did not specify a source file to be compiled)
examples >

It looks like the function get is not recognised as well as the use of
pointers. Any ideas?

Mark
Rugxulo
2010-09-13 22:41:48 UTC
Hi,
Post by marktyers
Post by Rugxulo
C:\blah [ GPC ] >\jkw\rugxulo\djgpp\bin\gawk -F, "{ sum += $3; }; END { print sum }" expenses.txt
202.63
Cheers for the solution. Not sure how Free Pascal differs from
'classic pascal' but I get a few errors.
Classic Pascal is just older, that's all, the "original", so to speak.
FPC follows the more popular (but slightly weird) TP/BP/Delphi
line(s). Honestly, just use whatever your compiler supports, but we've
got FPC and GPC and P5 and ... freely available, so it's all
good. ;-)
Post by marktyers
Free Pascal Compiler version 2.4.0 [2009/12/20] for i386
Fatal: Compilation aborted
It looks like the function get is not recognised as well as the use of
pointers. Any ideas?
DON'T PANIC! :-))

Probably my bad for posting "standard" in the "wrong" newsgroup, heh.
Here's a fixed version:

------------------------------
program testprog; {tested w/ FPC 2.4.0}

var testfile: text; tempreal, sum: real; i: integer; ch: char;

procedure skipfield; begin
repeat read(testfile,ch) until ch=','
end;

begin { testprog }

sum := 0; assign(testfile,'expenses.txt'); reset(testfile);

while not eof(testfile) do
begin
  for i := 1 to 2 do skipfield;
  readln(testfile,tempreal); writeln(tempreal:10:2,sum:10:2);
  sum := sum + tempreal;
end;

writeln('The total sum is',sum:10:2);

close(testfile);

end. { testprog }
------------------------------

Same EXPENSES.TXT data, so here's the output (though note I formatted
it better here, easier to read, IMHO):

     55.99      0.00
     28.50     55.99
     12.99     84.49
     18.50     97.48
      5.99    115.98
     55.68    121.97
     24.98    177.65
The total sum is    202.63
marktyers
2010-09-14 19:28:46 UTC
Thanks massively! Works a treat, so all I need to do now is study the
code to work out how it works.

Can't thank you all enough, especially Rugxulo. Amazed at the effort
you put in to help me. This has been really humbling and in massive
contrast to the response I got from comp.lang.cobol over a different
issue.

cheers
Wolf Behrenhoff
2010-09-14 20:14:36 UTC
Post by Rugxulo
DON'T PANIC! :-))
Got a towel? :-)
Post by Rugxulo
Probably my bad for posting "standard" in the "wrong" newsgroup, heh.
procedure skipfield; begin
repeat read(testfile,ch) until ch=','
end;
Reading character by character might be slow (although there might be
some internal buffers). I would rather use ReadLn and parse the string.
Because this is FPC, you or the OP could also have a look at the
TFileStream class.

What happens if the "testfile" does not contain a ','? So this lacks
basic error handling. Further, you cannot reuse the "skipfield"
function. As my last comment, you are using a global variable where it
isn't necessary ("ch" is only used in one function).
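
One way to address those points is sketched below, assuming Free Pascal in its default mode and expenses.txt in the current directory. It reads whole lines with ReadLn, parses the string, keeps all working variables out of the helper's global scope, takes the separator as a parameter, and reports lines that do not have three fields. TakeField is a hypothetical helper name, not something from the RTL.

```pascal
program SumThirdField;
{ Sketch: read whole lines, parse each string, report malformed lines. }

{ Hypothetical helper: removes the text before the first sep from rest
  and returns it in field; returns False when no separator is left. }
function TakeField(var rest: string; sep: Char; var field: string): Boolean;
var
  p: Integer;
begin
  p := Pos(sep, rest);
  TakeField := p > 0;
  if p > 0 then
  begin
    field := Copy(rest, 1, p - 1);
    Delete(rest, 1, p);
  end;
end;

var
  f: Text;
  line, skipped: string;
  amount, total: Double;
  code, lineNo: Integer;
begin
  total := 0;
  lineNo := 0;
  Assign(f, 'expenses.txt');
  Reset(f);
  while not Eof(f) do
  begin
    ReadLn(f, line);
    Inc(lineNo);
    { drop the first two fields; what remains in line is the amount }
    if TakeField(line, ',', skipped) and TakeField(line, ',', skipped) then
    begin
      Val(line, amount, code);
      if code = 0 then
        total := total + amount
      else
        WriteLn('line ', lineNo, ': bad amount "', line, '"');
    end
    else
      WriteLn('line ', lineNo, ': fewer than three fields');
  end;
  Close(f);
  WriteLn('The total sum is', total:10:2);
end.
```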

Wolf
marktyers
2010-09-14 20:48:30 UTC
Saw a mention of the TFileStream class and may need to investigate
further. Was hoping for a solution using only procedural approaches
but looks as if I may have to start using objects after all.

cheers
Rugxulo
2010-09-15 00:22:24 UTC
Hi,

On Sep 14, 3:14 pm, Wolf Behrenhoff
Post by Wolf Behrenhoff
Post by Rugxulo
DON'T PANIC!   :-))
Got a towel? :-)
You're a towel!
Post by Wolf Behrenhoff
Post by Rugxulo
procedure skipfield; begin
repeat read(testfile,ch) until ch=','
end;
Reading character by character might be slow (although there might be
some internal buffers).
Definitely true, but the sample data was short enough that it didn't
matter.
Post by Wolf Behrenhoff
I would rather use ReadLn and parse the string.
Reminds me of REXX's PARSE (yet another scripting language that could
handle this).
Post by Wolf Behrenhoff
Because this is FPC, you or the OP could also have a look at the
TFileStream class.
Never used it, dunno, sounds like overkill, but your experience is
wider than mine.
Post by Wolf Behrenhoff
What happens if the "testfile" does not contain a ','? So this lacks
basic error handling.
Right, the file might not even exist. Also I hardcoded the filename
(boo hiss) instead of using ParamStr(1). MarcoV probably would also
tell me to use AssignFile instead.
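
For what it is worth, a minimal sketch of those two fixes, assuming Free Pascal: take the file name from ParamStr(1) when one is given, and check IOResult after Reset with I/O checking switched off. The fallback name and the variable fileName are illustrative only.

```pascal
program OpenWithChecks;
var
  f: Text;
  fileName: string;
begin
  { use the first command-line argument if there is one }
  if ParamCount >= 1 then
    fileName := ParamStr(1)
  else
    fileName := 'expenses.txt';   { fallback, as in the thread's example }

  Assign(f, fileName);
  {$I-}                           { turn off automatic I/O error halts }
  Reset(f);
  {$I+}
  if IOResult <> 0 then
  begin
    WriteLn('Cannot open ', fileName);
    Halt(1);
  end;

  WriteLn('Opened ', fileName);
  Close(f);
end.
```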
Post by Wolf Behrenhoff
Further, you cannot reuse the "skipfield"
function.
Right, I hardcoded it to only check for commas.
Post by Wolf Behrenhoff
As my last comment, you are using a global variable where it
isn't necessary ("ch" is only used in one function).
Good point, forgot about that (though I didn't need it for ISO 7185).
Also I truly didn't include any useful code comments. And the
indentation left much to be desired.

In short, there's plenty of room for improvement on any project. Keep
on truckin'! Little by little does the trick.
