how to convert gpr file to csv format: using python

4
Hi
I am a beginner. I would like to know how to convert a file in gpr format to csv format using Python.
Baber
Jan 11 '07 #1
bartonc
6,596 Expert 4TB
Very well. Let's move this to the Python forum. Welcome to TSDN.
Jan 11 '07 #2
bartonc
6,596 Expert 4TB
Welcome to the Python Forum on TheScripts.com.
I don't recognize gpr. Is it some other text format or from a program?
Jan 11 '07 #3
ghostdog74
511 Expert 256MB
well, you should help us to help you by providing an example of the gpr format and your expected output (in this case, csv).
looking up the gpr extension, i can only find that it relates to some modeling software system...
Jan 12 '07 #4
bartonc
6,596 Expert 4TB
Hey ghostdog! Where you been so long?
I actually found the GenePix Results format, but don't know if this is the correct one:
GPR Header
A sample GPR file header and a description of each entry are shown below:

Entry: Description

ATF     1.0: File type and version number.
29       48: Number of optional header records and number of data fields (columns).
"Type=GenePix Results 3": Type of ATF file.
"DateTime=2002/02/09 17:15:48": Date and time when the image was acquired.
"Settings=C:\Genepix\Genepix.gps": The name of the settings file that was used for analysis.
"GalFile=C:\Genepix\Demo.gal": The GenePix Array List file used to associate Names and IDs to each entry.
"PixelSize=10": Resolution of each pixel in µm.
"Wavelengths=635     532": Installed laser excitation sources in nm.
"ImageFiles=C:\Genepix\demo.tif 0 C:\Genepix\Genepix.tif 1": The name and path of the associated TIF file(s).
"NormalizationMethod=None": The type of normalization method used, if applicable.
"NormalizationFactors=1    1": The normalization factor applied to each channel.
"JpegImage=C:\Genepix\demo.jpg": The name and path of the associated Jpeg image files.
"StdDev=Type 1": The type of standard deviation calculation selected in the Options settings.
"RatioFormulation=W1/W2 (635/532)": The ratio formulation of the ratio image, showing which image is numerator and which is denominator.
"Barcode=00331": The barcode symbols read from the image.
"BackgroundSubtraction=LocalFeature": The background subtraction method selected in the Options settings.
"ImageOrigin=0, 0": The origin of the image relative to the scan area.
"JpegOrigin=390, 4320": The origin of the Results JPEG image (the bounding box of the analysis Blocks) relative to the scan area origin.
"Creator=GenePix 4.1.1.4": The version of the GenePix Pro software used to create the Results file.
"Scanner=GenePix 4000B [serial number]": Type and serial number of scanner used to acquire the image.
"FocusPosition=0": The focus position setting used to acquire the image, in microns.
"Temperature=19.6127": The temperature of the scanner, in degrees C.
"LinesAveraged=1": The line average setting used to acquire the image.
"Comment=hyb 2673": User-entered file comment.
"PMTGain=500     600": The PMT settings during acquisition.
"ScanPower=100    100": The amount of laser transmission during acquisition.
"LaserPower=1    1": The power of each laser, in volts.
"LaserOnTime=5    5": The laser on-time for each laser, in minutes.
"Filters=<Empty>    <Empty>": Emission filters used during acquisition (GenePix 4100 and 4200 only.)
"ScanRegion=100,100,2000,2000": The coordinate values of the scan region used during acquisition, in pixels.
"Supplier=": Header field supplied in GAL file.
Data record column headings: Column titles for each measurement (see below).
Data Records: Extracted data.

GPR Data
The list below describes each column of data in the Results file.

Column Title: Description

Block: the block number of the feature.
Column: the column number of the feature.
Row: the row number of the feature.
Name: the name of the feature derived from the Array List (up to 40 characters long, contained in quotation marks).
ID: the unique identifier of the feature derived from the Array List (up to 40 characters long, contained in quotation marks).
X: the X-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image.
Y: the Y-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image.
Dia.: the diameter in µm of the feature-indicator.
F635 Median: median feature pixel intensity at wavelength #1 (635 nm).
F635 Mean: mean feature pixel intensity at wavelength #1 (635 nm).
F635 SD: the standard deviation of the feature pixel intensity at wavelength #1 (635 nm).
B635 Median: the median feature background intensity at wavelength #1 (635 nm).
B635 Mean: the mean feature background intensity at wavelength #1 (635 nm).
B635 SD: the standard deviation of the feature background intensity at wavelength #1 (635 nm).
% > B635 + 1 SD: the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #1 (635 nm).
% > B635 + 2 SD: the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #1 (635 nm).
F635 % Sat.: the percentage of feature pixels at wavelength #1 that are saturated.
F532 Median: median feature pixel intensity at wavelength #2 (532 nm).
F532 Mean: mean feature pixel intensity at wavelength #2 (532 nm).
F532 SD: the standard deviation of the feature intensity at wavelength #2 (532 nm).
B532 Median: the median feature background intensity at wavelength #2 (532 nm).
B532 Mean: the mean feature background intensity at wavelength #2 (532 nm).
B532 SD: the standard deviation of the feature background intensity at wavelength #2 (532 nm).
% > B532 + 1 SD: the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #2 (532 nm).
% > B532 + 2 SD: the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #2 (532 nm).
F532 % Sat.: the percentage of feature pixels at wavelength #2 that are saturated.
Ratio of Medians: the ratio of the median intensities of each feature for each wavelength, with the median background subtracted.
Ratio of Means: the ratio of the arithmetic mean intensities of each feature for each wavelength, with the median background subtracted.
Median of Ratios: the median of pixel-by-pixel ratios of pixel intensities, with the median background subtracted.
Mean of Ratios: the geometric mean of the pixel-by-pixel ratios of pixel intensities, with the median background subtracted.
Ratios SD: the geometric standard deviation of the pixel intensity ratios.
Rgn Ratio: the regression ratio of every pixel in a 2-feature-diameter circle around the center of the feature.
Rgn R²: the coefficient of determination for the current regression value.
F Pixels: the total number of feature pixels.
B Pixels: the total number of background pixels.
Sum of Medians: the sum of the median intensities for each wavelength, with the median background subtracted.
Sum of Means: the sum of the arithmetic mean intensities for each wavelength, with the median background subtracted.
Log Ratio: log (base 2) transform of the ratio of the medians.
Flags: the type of flag associated with a feature.
Normalize: the normalization status of the feature (included/not included).
F1 Median - B1: the median feature pixel intensity at wavelength #1 with the median background subtracted.
F2 Median - B2: the median feature pixel intensity at wavelength #2 with the median background subtracted.
F1 Mean - B1: the mean feature pixel intensity at wavelength #1 with the median background subtracted.
F2 Mean - B2: the mean feature pixel intensity at wavelength #2 with the median background subtracted.
SNR 1: the signal-to-noise ratio at wavelength #1, defined by (Mean Foreground 1 - Mean Background 1) / (Standard deviation of Background 1)
F1 Total Intensity: the sum of feature pixel intensities at wavelength #1
Index: the number of the feature as it occurs on the array.
"User Defined": user-defined feature data read from the GAL file (GenePix Pro 4.1).
Jan 12 '07 #5
ghostdog74
511 Expert 256MB
hey barton
i've been lurking around :-)...
anyway, thanks for the gpr format. if it's correct, then it's now up to the OP to specify his requirements. :)
Jan 12 '07 #6
baber
4
This example of a gpr file is a good one.
The gpr format (microarray data file) looks like this:

Description
line 1
line 2
Line n

col1 col2 ..... coln
line1 val1 val2 valn
line2 etc etc
line3 etc

Now, I want to know how to convert gpr to csv with Python.
Jan 16 '07 #7
bvdet
2,851 Expert Mod 2GB
If I understand this format correctly, it is a tab delimited file. The script below will replace each tab with a comma and output to another file:
    import os

    def tab_to_csv(tab_name, csv_name):
        """Copy tab_name to csv_name, replacing each tab with a comma."""
        try:
            with open(tab_name, 'r') as f1, open(csv_name, 'w') as f2:
                for line in f1:
                    f2.write(line.replace('\t', ','))
            return True
        except IOError:
            # The input could not be read or the output could not be written
            return False

    if __name__ == '__main__':
        gpr_file = os.path.join('H:\\', 'TEMP', 'temsys', 'GPR.gpr')
        csv_file = os.path.join('H:\\', 'TEMP', 'temsys', 'GPR.txt')
        if tab_to_csv(gpr_file, csv_file):
            print('Tab delimited file conversion to comma delimited file was successful')
        else:
            print('There was an error')
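A plain tab-to-comma replacement changes the column layout whenever a field itself contains a comma and is not quoted. As a rough alternative sketch (not part of the script above), the standard `csv` module can read the tab-delimited rows and re-quote such fields on output. The function name `tsv_to_csv` and the sample data are made up for illustration, and Python 3 is assumed.

```python
import csv
import io

def tsv_to_csv(src, dst):
    """Copy tab-delimited rows from src to dst as CSV.

    src and dst are open text-mode file objects. csv.writer quotes any
    field that contains a comma, so such fields stay in one column.
    """
    reader = csv.reader(src, delimiter='\t')
    writer = csv.writer(dst, lineterminator='\n')
    for row in reader:
        writer.writerow(row)

# In-memory demonstration with made-up data (not a real GPR file).
sample = io.StringIO('Block\tName\tID\n1\tVPS8, test\tYAL002W\n')
out = io.StringIO()
tsv_to_csv(sample, out)
print(out.getvalue())
# Block,Name,ID
# 1,"VPS8, test",YAL002W
```

With real files, open both with `newline=''`, as the `csv` module documentation recommends.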
Jan 16 '07 #8
bvdet
2,851 Expert Mod 2GB
Here's some more information I found on the gpr format:
ATF - Axon Text File format (*.atf)

ATF is a tab-delimited text file format that can be read by typical spreadsheet programs such as Microsoft Excel. It is used for GenePix Array List (GAL) files, and GenePix Results (GPR) files.

An ATF text file consists of records. Each line in the text file is a record. Each record may consist of several fields, separated by a field separator (column delimiter). The tab and comma characters are field separators. Space characters around a tab or comma are ignored and considered part of the field separator. Text strings are enclosed in quotation marks to ensure that any embedded spaces, commas and tabs are not mistaken for field separators.

The group of records at the beginning of the file is called the file header. The file header describes the file structure and includes column titles, units, and comments.
It would be great if baber could provide us with a sample gpr file so we could test it.
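Going by the ATF description above and the header layout posted earlier in this thread, the second line of the file gives the number of optional header records and the number of data columns, so a reader can skip the header and export only the column titles and data. The following is a minimal sketch under that assumption, untested against a real GPR file. The names `read_atf` and `atf_to_csv` are hypothetical, and Python 3 is assumed.

```python
import csv
import io

def read_atf(src):
    """Split an ATF stream into (header dict, column titles, data rows).

    Assumed layout: line 1 is 'ATF <version>', line 2 holds the number of
    optional header records and the number of data columns, followed by
    the header records, one row of column titles, and the data records.
    """
    rows = csv.reader(src, delimiter='\t')
    first = next(rows)
    if not first or first[0].split()[:1] != ['ATF']:
        raise ValueError('not an ATF file')
    # The two counts may be separated by tabs or by spaces.
    counts = ' '.join(next(rows)).split()
    n_header, n_columns = int(counts[0]), int(counts[1])
    header = {}
    for _ in range(n_header):
        record = next(rows)
        key, _sep, value = record[0].partition('=')
        header[key] = value
    titles = next(rows)[:n_columns]
    data = [row[:n_columns] for row in rows if row]
    return header, titles, data

def atf_to_csv(src, dst):
    """Write only the column titles and data records of src to dst as CSV."""
    header, titles, data = read_atf(src)
    writer = csv.writer(dst, lineterminator='\n')
    writer.writerow(titles)
    writer.writerows(data)
    return header

# In-memory demonstration with a shortened, made-up array list.
sample = io.StringIO(
    'ATF\t1\n2\t5\nType=GenePix ArrayList V1.0\nBlockCount=4\n'
    'Block\tColumn\tRow\tName\tID\n1\t1\t1\tVPS8\tYAL002W\n'
)
out = io.StringIO()
print(atf_to_csv(sample, out))
print(out.getvalue())
```

Returning the header as a dictionary keeps the settings available without mixing them into the CSV rows.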
Jan 16 '07 #9
dshimer
136 Expert 100+
This looks like a very straightforward text file: you could read in all the lines, split each line into a list, evaluate each list based on its contents, then just write it back out delimited by commas.

That said, I'll admit I'm still a bit confused by the format. Does this imply that each line "line 1" etc, is comprised of a bunch of data organized in columns? Or that there are N lines containing something, then a string of n entries of "col" data, followed by further strings of value data? In any case I can think of several ways to easily read and analyze the data, I just am not totally clear on what is being described.

Jan 16 '07 #10
bartonc
6,596 Expert 4TB
So this IS GenePix, right?
Jan 17 '07 #11
ghostdog74
511 Expert 256MB
i don't really know what your desired output is, but since you specified csv, i guess you just want comma-separated output. Here's a bit of code:
    import fileinput
    # inplace=1 redirects print output back into the file being read
    for line in fileinput.FileInput("file", inplace=1):
        print(','.join(line.split()))

output:
    line,1
    line,2
    Line,n

    col1,col2,.....,coln
    line1,val1,val2,valn
    line2,etc,etc
    line3,etc
Jan 17 '07 #12
bvdet
2,851 Expert Mod 2GB
It works except as indicated below. Before:
    ATF    1
    8    5
    Type=GenePix ArrayList V1.0
    BlockCount=4
    BlockType=0
    URL=http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?find+Locus+%22[ID]%22
    "Block1= 400, 400, 100, 24, 175, 5, 175"
    "Block2= 4896, 400, 100, 24, 175, 5, 175"
    "Block3= 400, 4896, 100, 24, 175, 5, 175"
    "Block4= 4896, 4896, 100, 24, 175, 5, 175"
    Block    Column    Row    Name    ID
    1    1    1    VPS8    YAL002W
    1    2    1    NTG1    YAL015C
After:
    ATF,1
    8,5
    Type=GenePix ArrayList V1.0
    BlockCount=4
    BlockType=0
    URL=http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?find+Locus+%22[ID]%22
    "Block1= 400, 400, 100, 24, 175, 5, 175"
    "Block2= 4896, 400, 100, 24, 175, 5, 175"
    "Block3= 400, 4896, 100, 24, 175, 5, 175"
    "Block4= 4896, 4896, 100, 24, 175, 5, 175"
    Block,Column,Row,Name,ID
    1,1,1,VPS8,YAL002W
    1,2,1,NTG1,YAL015C
To prevent unwanted commas at embedded spaces and trailing commas from trailing tabs, strip the trailing tab and newline characters and split on tabs only:
    import fileinput
    # gpr_file is the path to the file to convert; a '.bak' backup is kept
    for line in fileinput.input(gpr_file, inplace=True, backup='.bak'):
        print(','.join(line.rstrip('\t\n').split('\t')))
Good post ghostdog. I did not know about fileinput.
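The difference between the two splitting approaches shows up on a single header line. This is only a small illustration: the helper name `tabs_to_commas` is made up here, and the sample line is modelled on the array list header shown above, with the trailing tabs written out explicitly.

```python
def tabs_to_commas(line):
    """Convert one tab-delimited line, keeping spaces inside fields."""
    return ','.join(line.rstrip('\t\n').split('\t'))

# A header line with embedded spaces and trailing tabs.
line = 'Type=GenePix ArrayList V1.0\t\t\t\n'

print(','.join(line.split()))   # Type=GenePix,ArrayList,V1.0
print(tabs_to_commas(line))     # Type=GenePix ArrayList V1.0
```

Splitting on any whitespace breaks the field at its embedded spaces, while splitting on tabs after stripping the trailing ones keeps it whole.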
Jan 17 '07 #13
baber
4
Thanks a lot, now I can convert .gpr to .csv.

Baber
Jan 22 '07 #14
bartonc
6,596 Expert 4TB
Awesome! Thanks for the update.
Jan 23 '07 #15
Hi friends,
I want to know how to get .gpr files (microarray data files) and how to run them in MATLAB.
Please help me as soon as possible. I need it for my project.
Mar 20 '07 #16
bvdet
2,851 Expert Mod 2GB
Hello vijayachitra,

I don't know how to get GPR files. You can probably find some sample files on the internet. You have not given us enough information about what data you need to parse from a GPR file. Since you have found this thread, you can see that information can easily be extracted, but what information and in what format? How about this from our example:
    import re

    def readBlockData(fn):
        """Return {'BlockN': [ints]} for each 'BlockN=...' header record in fn."""
        dd = {}
        with open(fn) as f:
            for line in f:
                # Drop the surrounding quotes, trailing tabs and the newline
                line = line.strip('"\n\t')
                if re.match(r'Block\d', line):
                    tem = line.split('=')
                    dd[tem[0]] = [int(i) for i in tem[1].strip().split(', ')]
        return dd

    if __name__ == '__main__':
        dd = readBlockData('your_file')
        for key in dd:
            print('%s = %s' % (key, dd[key]))

    '''
    Sample output (key order may differ):
    Block4 = [4896, 4896, 100, 24, 175, 5, 175]
    Block3 = [400, 4896, 100, 24, 175, 5, 175]
    Block2 = [4896, 400, 100, 24, 175, 5, 175]
    Block1 = [400, 400, 100, 24, 175, 5, 175]
    '''
Mar 21 '07 #17
