How to count lines in a text file?

Hi all.

I'm trying to write a program that:
1) Asks me which file I want to count the number of lines in, then counts the
lines and writes the answer out.

2) I made the first part like this:

in_file = raw_input("What is the name of the file you want to open: ")
in_file = open("test.txt","r")
text = in_file.read()

3) I think that I have to use a for loop (something like: for line in text:
count +=1).
Or maybe I have to create a def, something like (def loop(line,
count)), but I'm not sure how to do this properly.
And then perhaps use the readlines() function, but again I'm not quite sure how
to do this. So does one of you have a good idea?

Thanks for all help

Jul 18 '05 #1
Oh I just did it.

Just used the line:

print "%d lines in your choosen file" % len(open("test.txt").readlines())

Thanks though :)
Jul 18 '05 #2
Yes, you need a for loop, and a count variable. You can count in several
ways. File objects are iterable, and they iterate over the lines in the
file. readlines() returns a list of the lines, which will have the same
effect, but because it builds the entire list in memory first, it uses
more memory. Example:

########

filename = raw_input('file? ')
file = open(filename)

lines = 0
for line in file:
    # line is ignored here, but it contains each line of the file,
    # including the newline
    lines += 1

print '%r has %r lines' % (filename, lines)

########
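
For anyone following along on Python 3 (an assumption; the thread itself uses Python 2), the same loop might look roughly like this. The name count_lines and the throwaway sample file are made up for illustration.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating over the file object, one line at a time."""
    lines = 0
    # The with-statement closes the file even if an error occurs.
    with open(path) as handle:
        for _line in handle:
            lines += 1
    return lines


# Try it on a small temporary file (hypothetical sample data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

print("%r has %r lines" % (sample_path, count_lines(sample_path)))
os.remove(sample_path)
```

Only one line is held in memory at a time, which is the point made above about readlines().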

Another alternative is to use the standard POSIX program "wc" with the
-l option, but this isn't Python.

On Mon, Sep 20, 2004 at 03:18:53PM +0200, Ling Lee wrote:
<SNIP>

Jul 18 '05 #3
Ling Lee <ja*****@mail.trillegaarden.dk> wrote:
Oh I just did it.

Just used the line:

print "%d lines in your choosen file" % len(open("test.txt").readlines())

Thanks though :)


You're welcome;-). However, this approach reads all of the file into
memory at once. If you must be able to deal with humongous files, too
big to fit in memory at once, try something like:

numlines = 0
for line in open('text.txt'): numlines += 1
Alex
Jul 18 '05 #4
Ling Lee wrote:
<SNIP>

text = in_file.readlines()
print len(text)

HtH, Roland
Jul 18 '05 #5
Thanks for your replies :)

I just ran the program with a different file name, and it only counts the
number of lines in the file named test.txt. I'll give it another try
with your input...

Thanks again for the fast reply... Hope I get it right this time :)

"Phil Frost" <in****@bitglue.com> wrote in message
news:ma**************************************@python.org...
<SNIP>

Jul 18 '05 #6
Phil Frost wrote:
another alternative is to use the standard posix program "wc" with the
-l option, but this isn't Python.


Not the same thing. wc -l counts newline bytes, not "real" lines.
Erik
Jul 18 '05 #7
Ling Lee said unto the world upon 2004-09-20 09:36:
Thanks for you replies :)

I just ran the program with a different file name, and it only counts the
number of lines in the file named test.txt. I try to give it a nother try
with your input...

Thanks again... for the fast reply... Hope I get it right this time :)


<SNIP>

Hi Ling Lee,

you've got:

in_file = raw_input("What is the name of the file you want to open: ")
in_file = open("test.txt","r")

What this does is take the user input and assign it to the name "in_file",
and then promptly reassign the name "in_file" to the result of
open("test.txt","r").

So, you never make use of the input, and keep asking it to open test.txt
instead.

Try something like:

in_file_name = raw_input("What is the file you want to open: ")
in_file = open(in_file_name,"r")
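
A possible Python 3 rendering of the same fix, as a sketch only: input() stands in for raw_input(), and the function name ask_and_count and its ask parameter are hypothetical, added so the prompt can be swapped out when trying the code.

```python
def ask_and_count(ask=input):
    """Ask for a file name, then return (name, number of lines)."""
    # One name for the file's name...
    in_file_name = ask("What is the file you want to open: ")
    # ...and a different name for the open file object.
    with open(in_file_name, "r") as in_file:
        return in_file_name, len(in_file.readlines())
```

Calling ask_and_count() with no argument prompts at the terminal. Because the two names are distinct, the user's answer is what actually gets opened.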

Also, and I say this as a fellow newbie, you might want to check out the
Tutor list: <http://mail.python.org/pipermail/tutor/>

HTH,

Brian vdB

Jul 18 '05 #8
Ling Lee wrote:
2) I made the first part like this:

in_file = raw_input("What is the name of the file you want to open: ")
in_file = open("test.txt","r")
text = in_file.read()
You have two different objects related to the file.
One is the filename (the result of calling raw_input) and
the other is the file handle (the result of calling open).
You are using same variable name for both of them. You
really should make them different.

First you get the file name and reference it by the variable
named 'in_file'. Next you use another filename ("test.txt")
for the open call. This returns a file handle, but not
a file handle to the file named in 'in_file'.

You then change things so that 'in_file' no longer refers
to the filename but now refers to the file handle.

A nicer solution is to use one variable name for the name
(like "in_filename") and another for the handle (you can
keep "in_file" if you want to). In the following I
reformatted it so the example fits in under 80 columns

in_filename = raw_input("What is the name of the file "
"you want to open: ")
in_file = open(in_filename,"r")
text = in_file.read()

Now the in_file.read() reads all of the file into memory. There
are several ways to count the number of lines. The first is
to count the number of newline characters. Because the newline
character is special, it's most often written as what's called
an escape code. In this case, "\n". Others are backspace ("\b")
and beep ("\a"), and backslash ("\\") since otherwise there's
no way to get the single character "\".

Here's how to count the number of newlines in the text

num_lines = text.count("\n")

print "There are", num_lines, "lines in", in_filename

This will work for almost every file except for one where
the last line doesn't end with a newline. It's rare, but
it does happen. To fix that you need to see if the
text ends with a newline and if it doesn't then add one
more to the count

num_lines = text.count("\n")
# an empty file has no lines, so only adjust non-empty text
if text and not text.endswith("\n"):
    num_lines = num_lines + 1

print "There are", num_lines, "lines in", in_filename
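
The newline-counting rule above can be wrapped in a small function and checked against a few edge cases. This is a Python 3 sketch; the name count_by_newlines is made up here, and the guard for empty text is an added assumption about what an empty file should report.

```python
def count_by_newlines(text):
    """Count lines as newline characters, plus one for an unterminated last line."""
    num_lines = text.count("\n")
    # Non-empty text that does not end in a newline has one more line.
    if text and not text.endswith("\n"):
        num_lines += 1
    return num_lines


for sample in ("", "a\n", "a\nb", "a\nb\n"):
    print(repr(sample), count_by_newlines(sample))
```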

3) I think that I have to use a for loop ( something like
for line in text: count +=1)


Something like that will work. When you say "for xxxx in string"
it loops through every character in the string, and not
every line. What you need is some way to get the lines.

One solution is to use the 'splitlines' method of strings.
This knows how to deal with the "final line doesn't end with
a newline" case and return a list of all the lines. You
can use it like this

count = 0
for line in text.splitlines():
    count = count + 1

or, since splitlines() returns a list of lines you can
also do

count = len(text.splitlines())
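
To make the difference between looping over a string and looping over its lines concrete, here is a tiny Python 3 sketch (the sample text is invented):

```python
text = "ab\ncd"

# Looping over a string visits characters, so this counts 5 items.
char_count = 0
for _ch in text:
    char_count += 1

# splitlines() yields whole lines, with or without a final newline.
line_count = len(text.splitlines())

print(char_count, line_count)  # prints: 5 2
```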

It turns out that reading lines from a file is very common.
When you say "for xxx in file" it loops through every line
in the file. This is not a list so you can't say

len(open(in_filename, "r")) # DOES NOT WORK

instead you need to have the explicit loop, like this

count = 0
for line in open(in_filename, "r"):
    count = count + 1

An advantage to this approach is that it doesn't read
the whole file into memory. That's only a problem
if you have a large file. Try counting the number of
lines in a 1.5 GB file!
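
A variation that is not from the thread, offered only as a sketch: for very large files one can also read fixed-size binary chunks and count newline bytes, which bounds memory even when a single line is enormous. The function name and the 1 MB default are arbitrary choices, and this counts newline bytes only, so an unterminated final line is not counted.

```python
def count_newlines_chunked(path, chunk_size=1024 * 1024):
    """Count newline bytes while holding at most chunk_size bytes in memory."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            total += chunk.count(b"\n")
    return total
```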

By the way, "r" is the default mode for open.
Most people omit it from the parameter list and just use

open(in_filename)

Hope this helped!

By the way, you might want to look at the "Beginner's
Guide to Python" page at http://python.org/topics/learn/ .
It has pointers to resources that might help, including
the tutor mailing list meant for people like you who
are learning to program in Python.

Andrew
da***@dalkescientific.com
Jul 18 '05 #9
Thanks for explaining it that well, really makes sense now :)

Cheers....
"Andrew Dalke" <ad****@mindspring.com> wrote in message
news:ek**************@newsread3.news.pas.earthlink.net...
<SNIP>

Jul 18 '05 #10
On Mon, 20 Sep 2004 15:29:18 +0200, rumours say that al*****@yahoo.com
(Alex Martelli) might have written:
Ling Lee <ja*****@mail.trillegaarden.dk> wrote:
Oh I just did it.

Just used the line:

print "%d lines in your choosen file" % len(open("test.txt").readlines())

Thanks though :)


[Alex] You're welcome;-). However, this approach reads all of the file into
memory at once. If you must be able to deal with humongous files, too
big to fit in memory at once, try something like:

numlines = 0
for line in open('text.txt'): numlines += 1


And a short story of premature optimisation follows...

Saw the plain code above and instantly the programmer's instinct of
optimisation came into action... we all know that C loops are faster
than python loops, right? So I spent 2 minutes of my time to write the
following 'clever' function:

def count_lines(filename):
    fp = open(filename)
    count = 1 + max(enumerate(fp))[0]
    fp.close()
    return count

Proud of my programming skills, I timed it against another function
containing Alex's code. Guess what? My code was slower... (and I should
put in a try: ... except ValueError: clause to cater for empty files)

Of course, on second thought, the reason must be that enumerate
generates one tuple for every line in the file; in any case, I'll mark
this rule:

C loops are *always* faster than python loops, unless the loop does
something useful ;-) in the latter case, timeit.py is your friend.
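
For what it's worth, the empty-file problem mentioned above can be sidestepped without try/except. This is a Python 3 sketch under that assumption, with a made-up function name; it makes no claim about being faster than the plain loop.

```python
def count_lines_enumerate(path):
    """Count lines with enumerate(); an empty file leaves count at 0."""
    count = 0
    with open(path) as fp:
        # Start numbering at 1 so the last index is the line count.
        for count, _line in enumerate(fp, 1):
            pass
    return count
```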
--
TZOTZIOY, I speak England very best,
"Tssss!" --Brad Pitt as Achilles in unprecedented Ancient Greek
Jul 18 '05 #11
Christos TZOTZIOY Georgiou <tz**@sil-tec.gr> wrote:
...
memory at once. If you must be able to deal with humongous files, too
big to fit in memory at once, try something like:

numlines = 0
for line in open('text.txt'): numlines += 1
And a short story of premature optimisation follows...


Thanks for sharing!
def count_lines(filename):
    fp = open(filename)
    count = 1 + max(enumerate(fp))[0]
    fp.close()
    return count
Cute, actually!
containing Alex' code. Guess what? My code was slower... (and I should
put a try: except Value: clause to cater for empty files)

Of course, on second thought, the reason must be that enumerate
generates one tuple for every line in the file; in any case, I'll mark


I thought built-ins could recycle their tuples, sometimes, but you may
in fact be right (we should check with Raymond Hettinger, though).

With 2.4, I measure 30 msec with your approach, and 24 with mine, to
count the 45425 lines of /usr/share/dict/words on my Linux box
(admittedly not a great example of 'humongous file'); and similarly
kjv.txt, a King James' Bible (31103 lines, but 10 times the size of the
words file), 41 with yours, 36 with mine. They're pretty close. At
least they beat len(file(...).readlines()), which takes 33 on words, 62
on kjv.txt...

If one is really in a hurry counting lines, a dedicated C extension
might help. E.g.:

static PyObject *count(PyObject *self, PyObject *args)
{
    PyObject* seq;
    PyObject* item;
    int result;

    /* get one argument as an iterator */
    if(!PyArg_ParseTuple(args, "O", &seq))
        return 0;
    seq = PyObject_GetIter(seq);
    if(!seq)
        return 0;

    /* count items */
    result = 0;
    while((item=PyIter_Next(seq))) {
        result += 1;
        Py_DECREF(item);
    }

    /* clean up and return result */
    Py_DECREF(seq);
    return Py_BuildValue("i", result);
}

Using this count-items-in-iterable thingy, words takes 10 msec, kjv
takes 26.

Happier news is that one does NOT have to learn C to gain this.
Consider the Pyrex file:

def count(seq):
    cdef int i
    it = iter(seq)
    i = 0
    for x in it:
        i = i + 1
    return i

pyrexc'ing this and building the Python extension from the resulting C
file gives just about the same performance as the pure-C coding: 10 msec
on words, 26 on kjv, the same to within 1% as pure-C coding (there is a
systematic speedup of a bit less than 1% for the C-coded function).

And if one doesn't even want to use pyrex? Why, that's what psyco is
for...:

import psyco
def count(seq):
    it = iter(seq)
    i = 0
    for x in it:
        i = i + 1
    return i
# bind the function itself (seq is only a parameter name)
psyco.bind(count)

Again to the same level of precision, the SAME numbers, 10 and 26 msec
(actually, in this case the less-than-1% systematic bias is in favour of
psyco compared to pure-C coding...!-)
So: your instinct that C-coded loops are faster wasn't too far off...
and you can get the same performance (just about) with Pyrex or (on an
Intel or compatible processor, only -- sigh) with psyco.
Alex
Jul 18 '05 #12
On Mon, 20 Sep 2004 15:29:18 +0200, al*****@yahoo.com (Alex Martelli) wrote:
Ling Lee <ja*****@mail.trillegaarden.dk> wrote:
Oh I just did it.

Just used the line:

print "%d lines in your choosen file" % len(open("test.txt").readlines())

Thanks though :)


You're welcome;-). However, this approach reads all of the file into
memory at once. If you must be able to deal with humongous files, too
big to fit in memory at once, try something like:

numlines = 0
for line in open('text.txt'): numlines += 1


I don't have 2.4, but how would that compare with a generator expression like (untested)

sum(1 for line in open('text.txt'))

or, if you _are_ willing to read in the whole file,

open('text.txt').read().count('\n')

Regards,
Bengt Richter
Jul 18 '05 #13
Bengt Richter <bo**@oz.net> wrote:
...
memory at once. If you must be able to deal with humongous files, too
big to fit in memory at once, try something like:

numlines = 0
for line in open('text.txt'): numlines += 1
I don't have 2.4


2.4a3 is freely available for download and everybody's _encouraged_ to
download it and try it out -- come on, don't be the last one to!-)
but how would that compare with a generator expression like (untested)

sum(1 for line in open('text.txt'))

or, if you _are_ willing to read in the whole file,

open('text.txt').read().count('\n')


I'm not on the same machine as when I ran the other timing measurements
(including pyrex &c) but here's the results on this one machine...:

$ wc /usr/share/dict/words
234937 234937 2486825 /usr/share/dict/words
$ python2.4 ~/cb/timeit.py "numlines=0
for line in file('/usr/share/dict/words'): numlines+=1"
10 loops, best of 3: 3.08e+05 usec per loop
$ python2.4 ~/cb/timeit.py "file('/usr/share/dict/words').read().count('\n')"
10 loops, best of 3: 2.72e+05 usec per loop
$ python2.4 ~/cb/timeit.py "len(file('/usr/share/dict/words').readlines())"
10 loops, best of 3: 3.25e+05 usec per loop
$ python2.4 ~/cb/timeit.py "sum(1 for line in file('/usr/share/dict/words'))"
10 loops, best of 3: 4.42e+05 usec per loop

Last but not least...:

$ python2.4 ~/cb/timeit.py -s'import cou' "cou.cou(file('/usr/share/dict/words'))"
10 loops, best of 3: 2.05e+05 usec per loop

where cou.pyx is the pyrex program I've already shown on the other
subthread. Using the count.c I've also shown takes 2.03e+05 usec.
(Can't try psyco here, not an intel-like cpu).
Summary: "sum(1 for ...)" is no speed demon; the plain loop is best
among the pure-python approaches for files that can't fit in memory. If
the file DOES fit in memory, read().count('\n') is faster, but
len(...readlines()) is slower. Pyrex rocks, essentially removing the
need for C-coded extensions (less than a 1% advantage) -- and so does
psyco, but not if you're using a Mac (quick, somebody gift Armin Rigo
with a Mac before it's too late...!!!).
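
Anyone wanting to repeat the pure-Python part of this comparison can do so with the timeit module from inside a script. The following is a Python 3 sketch; the function names, the generated sample file, and the repeat counts are all assumptions, and the numbers it prints will differ from the 2004 figures above.

```python
import os
import tempfile
import timeit


def loop_count(path):
    numlines = 0
    with open(path) as handle:
        for _line in handle:
            numlines += 1
    return numlines


def read_count(path):
    with open(path) as handle:
        return handle.read().count("\n")


def readlines_count(path):
    with open(path) as handle:
        return len(handle.readlines())


def genexp_count(path):
    with open(path) as handle:
        return sum(1 for _line in handle)


def time_all(path, number=5, repeat=3):
    """Return {function name: best seconds per batch of `number` calls}."""
    results = {}
    for func in (loop_count, read_count, readlines_count, genexp_count):
        timings = timeit.repeat(lambda: func(path), number=number, repeat=repeat)
        results[func.__name__] = min(timings)
    return results


# Build a throwaway 20,000-line file (hypothetical data, not the
# /usr/share/dict/words file used in the thread).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("word\n" * 20000)
    sample_path = tmp.name

for name, seconds in sorted(time_all(sample_path).items()):
    print("%-16s %.4f s" % (name, seconds))
os.remove(sample_path)
```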
Alex
Jul 18 '05 #14
Bengt Richter wrote:
or, if you _are_ willing to read in the whole file,

open('text.txt').read().count('\n')


Except the last line might not have a terminal newline.

Andrew
da***@dalkescientific.com
Jul 18 '05 #15
Alex Martelli wrote:
If one is really in a hurry counting lines, a dedicated C extension
might help. E.g.:

static PyObject *count(PyObject *self, PyObject *args) ...
Using this count-items-in-iterable thingy


There's been a few times I've wanted a function like
this. I keep expecting that len(iterable) will work,
but of course it doesn't.

Would itertools.len(iterable) be useful? More likely
the name collision with len itself would be a problem,
so perhaps itertools.length(iterable).
BTW, I saw itertools.count and figured that might be
it. Nope. And don't try the following
>>> import itertools
>>> itertools.count(5)
count(5)
>>> print list(_)


:)

Andrew
da***@dalkescientific.com
Jul 18 '05 #16
On Wed, 22 Sep 2004 19:48:21 GMT, Andrew Dalke <ad****@mindspring.com> wrote:
Bengt Richter wrote:
or, if you _are_ willing to read in the whole file,

open('text.txt').read().count('\n')


Except the last line might not have a terminal newline.

I _knew_ I should have mentioned that ;-)

Regards,
Bengt Richter
Jul 18 '05 #17
Andrew Dalke <ad****@mindspring.com> wrote:
Alex Martelli wrote:
If one is really in a hurry counting lines, a dedicated C extension
might help. E.g.:

static PyObject *count(PyObject *self, PyObject *args) ...
Using this count-items-in-iterable thingy


There's been a few times I've wanted a function like


Me too, that's why I wrote the C and Pyrex versions:-).
this. I keep expecting that len(iterable) will work,
but of course it doesn't.
Yep -- it would probably be too risky to have len(...) consume a whole
iterator, beginning users wouldn't expect that and might get burnt.

Would itertools.len(iterable) be useful? More likely
the name collision with len itself would be a problem,
so perhaps itertools.length(iterable).


Unfortunately, itertools's functions are there to produce iterators, not
to consume them. I doubt Raymond Hettinger, itertools' guru, would
approve of changing that (though one could surely ask him, and if he
surprised me, I guess the change might get in).

There's currently no good single place for 'accumulators', i.e.
consumers of iterators which produce scalars or thereabouts -- sum, max,
and min, are built-ins; other useful accumulators can be found in heapq
(because they're implemented via a heap...)... and there's nowhere to
put the obviously needed "trivial" accumulators, such as average,
median, variance, count...

A "stats" module was proposed, but also shot down (presumably people
have more ambitious ideas about 'statistics' than these simple
accumulators, alas -- I'm not sure exactly what the problem was).
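
As an illustration of the "count" accumulator being discussed, and of why having len() silently consume an iterator could burn people, here is a short Python 3 sketch. The name count_items is hypothetical; nothing with that name exists in itertools.

```python
def count_items(iterable):
    """Consume the iterable and return how many items it produced."""
    return sum(1 for _ in iterable)


squares = (n * n for n in range(5))
print(count_items(squares))  # prints: 5
# The generator is now exhausted, so nothing is left to iterate over.
print(list(squares))  # prints: []
```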
Alex
Jul 18 '05 #18
Andrew Dalke <ad****@mindspring.com> wrote:
Bengt Richter wrote:
or, if you _are_ willing to read in the whole file,

open('text.txt').read().count('\n')


Except the last line might not have a terminal newline.


....and wc would then not count that non-line as a line, so why should
we...? Witness...:

$ echo -n 'bu'>em
$ wc em
0 1 2 em

zero lines, one word, two characters: seems right to me.
Alex
Jul 18 '05 #19
Alex Martelli wrote:
....and wc would then not count that non-line as a line, so why should
we...? Witness...:

'Cause that's what Python does. Witness:

% echo -n 'bu' | python -c \
? 'import sys; print len(sys.stdin.readlines())'
1

;)

Andrew
da***@dalkescientific.com
Jul 18 '05 #20
Andrew Dalke <ad****@mindspring.com> wrote:
Alex Martelli wrote:
....and wc would then not count that non-line as a line, so why should
we...? Witness...:
'Cause that's what Python does. Witness:


If you tell it to count non-lines too (pieces that don't end with an
endline marker), it does, of course:
% echo -n 'bu' | python -c \
? 'import sys; print len(sys.stdin.readlines())'
1


But that's just because you told it to.

To reproduce wc's behavior, you have to exclude non-lines -- use
len([ l for l in sys.stdin if l.endswith('\n') ]) for example. Or, the
simpler .count('\n') approach.

I suspect somebody who asks the subject question wants to reproduce wc's
counting behavior. Of course, it _is_ an imprecise spec they're giving.
Alex
Jul 18 '05 #21
Alex Martelli wrote:
If you tell it to count non-lines too (pieces that don't end with an
endline marker), it does, of course:
My reply was meant to be a bit of a jest, pointing out that
I'm using Python's definition of a line. Otherwise if
lines must end with a newline then the method should be
named "readlines_and_any_trailing_text()"

Since you used

numlines=0
for line in file('/usr/share/dict/words'): numlines+=1

as a way to count lines, I assumed you would agree with
Python's definition as a reasonable way to count the
number of lines in a file and that your previous post
(on the behavior of wc) was meant more as a rhetorical
way to highlight the ambiguity than as an assertion of
general correctness.

I suspect somebody who asks the subject question wants to reproduce wc's
counting behavior.
Really? I was actually surprised at what wc does. I didn't
realize it only did a "\n" character count. The other programs
I know of number based on the start of line rather than end
of line.
% echo -n "blah" > blah.txt
% less blah.txt
(then press "=")
blah.txt lines 1-1/1 byte 4/4 (END) (press RETURN)
% echo -n "" | perl -ne '$line++; END{$line+=0;print "$line\n"}'
0
% echo -n "blah" | perl -ne '$line++; END{$line+=0;print "$line\n"}'
1

% echo -n "" | awk 'END {print NR}'
0
% echo -n "blah" | awk 'END {print NR}'
1

% echo -n "blah" | grep -n "blah"
1:blah
Of course, it _is_ an imprecise spec they're giving.


Yup.

Andrew
da***@dalkescientific.com
Jul 18 '05 #22
Andrew Dalke <ad****@mindspring.com> wrote:
I suspect somebody who asks the subject question wants to reproduce wc's
counting behavior.


Really? I was actually surprised at what wc does. I didn't
realize it only did a "\n" character count. The other programs


Ah well -- maybe it's just me, 25+ years of either using Unix or pining
for it (when I had to use VMS, VM/SP, Windows, etc, etc) must have left
their mark.
Alex
Jul 18 '05 #23