writing to a file

as i understand there are two ways to write data to a file: using
f.write("foo") and print >>f, "foo".
what i want to know is which one is faster (if there is any difference
in speed) since i'm working with very large files. of course, if there
is any other way to write data to a file, i'd love to hear about it

May 30 '07 #1

montyphy...@gmail.com wrote:
> as i understand there are two ways to write data to a file: using
> f.write("foo") and print >>f, "foo".
well, print will also add a '\n' (or a ' ' if you end the statement with ',')
> what i want to know is which one is faster (if there is any difference
there shouldn't be any noticeable difference
> in speed) since i'm working with very large files. of course, if there
> is any other way to write data to a file, i'd love to hear about it
other ways:
os.system('cat file1 >file2')
or subprocess.Popen
or print but sys.stdout = f
or ctypes + printf/fputs/..

and probably there are other obscure ways, but the intended way is
obviously f.write
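
A rough way to check the "no noticeable difference" claim on your own machine is to time both spellings. This is only a sketch and assumes Python 3, where print >>f, "foo" is written print("foo", file=f); the function names, file names and the line count N are made up for illustration, and the numbers will vary with the system and its buffering.

```python
import os
import tempfile
import timeit

def write_with_method(path, n):
    # write() adds nothing on its own, so the newline is explicit.
    with open(path, "w") as f:
        for _ in range(n):
            f.write("foo\n")

def write_with_print(path, n):
    # print() appends the newline itself (end="\n" by default).
    with open(path, "w") as f:
        for _ in range(n):
            print("foo", file=f)

def read_back(path):
    with open(path) as f:
        return f.read()

tmpdir = tempfile.mkdtemp()
path_a = os.path.join(tmpdir, "write.txt")
path_b = os.path.join(tmpdir, "print.txt")
N = 100_000  # arbitrary number of lines per run

t_write = timeit.timeit(lambda: write_with_method(path_a, N), number=3)
t_print = timeit.timeit(lambda: write_with_print(path_b, N), number=3)
print("f.write: %.3f s   print: %.3f s" % (t_write, t_print))

# Both approaches should leave identical text in the file.
same_output = read_back(path_a) == read_back(path_b)
```

Whichever comes out ahead, both go through the same buffered file object, so for very large files the disk is likely to matter more than the choice of call.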

nsz

May 30 '07 #2
mo*********@gmail.com wrote:
> as i understand there are two ways to write data to a file: using
> f.write("foo") and print >>f, "foo".
> what i want to know is which one is faster (if there is any difference
> in speed) since i'm working with very large files. of course, if there
> is any other way to write data to a file, i'd love to hear about it
You should look at the mmap-module.

Diez
May 30 '07 #3
On May 30, 1:41 pm, "Diez B. Roggisch" <d...@nospam.web.de> wrote:
> montyphy...@gmail.com wrote:
>> what i want to know is which one is faster (if there is any difference
>> in speed) since i'm working with very large files. of course, if there
>> is any other way to write data to a file, i'd love to hear about it
>
> You should look at the mmap-module.

Yes, memory mappings can be more efficient than files accessed using
file descriptors. But mmap does not take an offset parameter, and is
therefore not suited for working with large files. For example you
only have a virtual memory space of 4 GiB on a 32 bit system, so there
is no way mmap can access the last 4 GiB of an 8 GiB file on a 32 bit
system. If mmap took an offset parameter, this would not be a problem.
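
For readers on a newer interpreter: Python releases after this thread (2.6 onward, as far as I know) did add an offset argument to mmap.mmap, which has to be a multiple of mmap.ALLOCATIONGRANULARITY. A minimal sketch of mapping only a window of a file follows; the scratch file and its size are invented for the example, and a real large file would simply use a bigger offset.

```python
import mmap
import os
import tempfile

GRAN = mmap.ALLOCATIONGRANULARITY  # offsets must be a multiple of this

# Hypothetical scratch file: four granules of zero bytes.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"\0" * (4 * GRAN))

with open(path, "r+b") as f:
    # Map only the third granule instead of the whole file.
    window = mmap.mmap(f.fileno(), length=GRAN, offset=2 * GRAN)
    window[0:5] = b"hello"   # index 0 of the window is file offset 2 * GRAN
    window.flush()           # push the change out to the file
    window.close()

# Confirm through ordinary file I/O that the bytes landed at the offset.
with open(path, "rb") as f:
    f.seek(2 * GRAN)
    patched = f.read(5)
os.remove(path)
```

Because only GRAN bytes are mapped at a time, the address space needed stays small no matter how large the file is.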

However, numpy has a properly working memory mapped array class,
numpy.memmap. It can be used for fast file access. Numpy also has a
wide range of datatypes that are efficient for working with binary
data (e.g. an uint8 type for bytes), and a record array for working
with structured binary data. This makes numpy very attractive when
working with binary data files.

Get the latest numpy here: www.scipy.org.

Let us say you want to memory map a 24-bit RGB image of 640 x 480
pixels, located at an offset of 4096 bytes into the file 'myfile.dat'.
Here is how numpy could do it:

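import numpy

byte = numpy.uint8
# one record per pixel: three unsigned bytes named r, g and b
desc = numpy.dtype({'names':['r','g','b'],'formats':[byte,byte,byte]})
# map 480 rows x 640 columns of records, skipping the first 4096 bytes
mm = numpy.memmap('myfile.dat', dtype=desc, offset=4096,
                  shape=(480,640), order='C')
# field access returns views into the mapping, not copies
red = mm['r']
green = mm['g']
blue = mm['b']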

Now you can access the RGB values simply by slicing the arrays red,
green, and blue. To set the R value of every other horizontal line to
0, you could simply write

red[::2,:] = 0

As always when working with memory mapped files, the changes are not
guaranteed to be written to disk until the memory mapping is
synchronized with the file system. Thus, call

mm.sync()

when you want the actual write process to start.

The memory mapping will be closed when it is garbage collected
(typically when the reference count falls to zero) or when you call
mm.close().
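
The whole round trip can be tried without a real image file. The sketch below assumes a current NumPy, where the synchronizing call is named flush() and the mapping is released by dropping the last reference to it; the temporary file merely stands in for 'myfile.dat', and the all-0xff pixel data is invented so the change is easy to see.

```python
import os
import tempfile
import numpy as np

HEADER = 4096        # bytes to skip before the pixel data
SHAPE = (480, 640)   # rows, columns
rgb = np.dtype([("r", np.uint8), ("g", np.uint8), ("b", np.uint8)])

# Hypothetical stand-in for 'myfile.dat': a zero header plus all-0xff pixels.
fd, path = tempfile.mkstemp(suffix=".dat")
os.close(fd)
with open(path, "wb") as f:
    f.write(b"\0" * HEADER)
    f.write(b"\xff" * (SHAPE[0] * SHAPE[1] * rgb.itemsize))

mm = np.memmap(path, dtype=rgb, mode="r+", offset=HEADER,
               shape=SHAPE, order="C")
mm["r"][::2, :] = 0  # zero the red channel on every other row
mm.flush()           # ask for the modified pages to be written back
del mm               # drop the reference so the mapping can be released

# Re-read through ordinary file I/O to confirm the change reached the file.
with open(path, "rb") as f:
    f.seek(HEADER)
    first_pixel = f.read(3)        # row 0: red was zeroed
    f.seek(HEADER + SHAPE[1] * rgb.itemsize)
    second_row_pixel = f.read(3)   # row 1: untouched
os.remove(path)
```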

May 30 '07 #4
On May 30, 4:53 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
> import numpy
>
> byte = numpy.uint8
> desc = numpy.dtype({'names':['r','g','b'],'formats':[byte,byte,byte]})
> mm = numpy.memmap('myfile.dat', dtype=desc, offset=4096,
>                   shape=(480,640), order='C')
> red = mm['r']
> green = mm['g']
> blue = mm['b']

Another thing you may commonly want to do is converting between numpy
uint8 arrays and raw strings. This is done using the function
numpy.fromstring and the array method tostring.

# reading from file to raw string
rstr = mm.tostring()

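# writing raw string to file (parse it with the mapping's own record
# dtype and shape, otherwise the assignment does not fit)
mm[:] = numpy.fromstring(rstr, dtype=mm.dtype).reshape(mm.shape)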
mm.sync()
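
On a current NumPy the same conversion is usually spelled with tobytes() and numpy.frombuffer, since tostring() and the binary mode of fromstring were later deprecated. A small in-memory sketch, assuming a 2 x 3 array of the same r/g/b records as a stand-in for mm:

```python
import numpy as np

rgb = np.dtype([("r", np.uint8), ("g", np.uint8), ("b", np.uint8)])
pixels = np.zeros((2, 3), dtype=rgb)   # small in-memory stand-in for mm
pixels["g"] = 7

# array -> raw bytes, laid out in C order
raw = pixels.tobytes()

# raw bytes -> structured array with the original dtype and shape
restored = np.frombuffer(raw, dtype=rgb).reshape(pixels.shape)

# the same bytes viewed as a flat uint8 array (r, g, b, r, g, b, ...)
as_uint8 = np.frombuffer(raw, dtype=np.uint8)
```

Note that frombuffer returns a read-only view of the bytes object, so copy it before modifying; assigning it into a writable mapping with mm[:] = restored works as is.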




May 30 '07 #5
On May 30, 4:53 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
> However, numpy has a properly working memory mapped array class,
> numpy.memmap.
It seems that NumPy's memmap uses a buffer from mmap, which makes both
of them defunct for large files. Damn.

mmap must be fixed.


May 30 '07 #6
