I have an assignment due in a few weeks, and since I am done with the upcoming one, I decided to start on the next one early rather than rush later. The first part requires me to do a character count on a text document, which I will then have to encode using Huffman coding.
Below is the code that I created to count and display the characters and their frequencies in the document.

    import java.io.*;
    import java.util.*;

    public class SourceModel
    {

        public static void main(String [] Args)
        {

            int j;
            String str;

            Map<Integer, Integer> m = new HashMap<Integer, Integer>();

            try
            {
                BufferedReader in = new BufferedReader(new FileReader("DecOfInd.txt"));

                while ((str = in.readLine()) != null)
                {
                    for(int i = 0; i < str.length(); i++)
                    {

                        if(!m.containsKey((int)str.charAt(i)))
                        {
                            m.put((int)str.charAt(i),1);
                        }
                        else
                        {
                            j = m.get((int)str.charAt(i));
                            j++;

                            m.put((int)str.charAt(i),j);

                        }

                    }
                }

            }
            catch (IOException e)
            {

            }

            System.out.println();
            System.out.println(m.size() + " distinct letters:");
            System.out.println(m);

            int count = 0;

            for(int i :m.keySet())
            {
                System.out.println(i + " = " + m.get(i));
                count += m.get(i);
            }
            System.out.println("Total Number of Characters: "+count);

        }
    }

The code compiles without errors, and on the first run everything looked OK: it displayed the characters and their counts. But just to make sure I got it right, I ran the Unix command "wc" on the document, and it seems I am missing some characters.
Unix gives:
$ wc DecOfInd.txt
29 1369 8458 DecOfInd.txt
My program gives:
Total Number of Characters: 8429
Would any kind soul like to help me find out what's wrong with my code? I seem to have 29 missing characters.
Many thanks in advance.
Kenneth :)
Why did you use Map<Integer, Integer> instead of Map<Character, Integer> ?
I was trying to use the ASCII code as the key.
wc also counts the end-of-line characters (there are 29 lines in your file). Your
method doesn't (readLine() returns each line as a String with the \n stripped), so
8429 + 29 == 8458
kind regards,
Jos
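To make the discrepancy concrete, here is a minimal sketch that counts the same text both ways. The class name, method names, and sample text are made up for illustration; the only library behaviour relied on is that BufferedReader.readLine() returns each line without its terminator, while read() returns every character. Note that the third column of wc output is a byte count, which matches the character count only for single-byte text such as plain ASCII.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineTerminatorDemo {

    // Counts characters line by line, as the posted program does.
    // readLine() returns each line WITHOUT its terminator.
    static int countWithReadLine(String text) {
        int total = 0;
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                total += line.length();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Counts every character, line terminators included.
    static int countWithRead(String text) {
        int total = 0;
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            while (in.read() != -1) {
                total++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical three-line sample, each line ending in "\n".
        String sample = "first line\nsecond line\nthird line\n";
        int perLine = countWithReadLine(sample);
        int perChar = countWithRead(sample);
        System.out.println("readLine(): " + perLine);
        System.out.println("read():     " + perChar);
        System.out.println("difference: " + (perChar - perLine));
    }
}
```

For the three-line sample the difference is 3, one per "\n", which is the same effect as the 29 missing characters in the 29-line file.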
Duh .
<vigorously starts kicking dust towards the general direction of r035198x/>
:-P
kind regards,
Jos ;-)
@OP: and if you run your code on other OSes, you can get different results again. On Windows, for example, the line separator consists of two characters: \r\n.
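As a small illustration of the platform point, the sketch below counts how many characters a readLine()-based count would drop for the same two lines stored with Unix-style and Windows-style terminators. The class name, method name, and sample strings are hypothetical; the sketch relies on readLine() treating "\n", "\r", and "\r\n" as line terminators.

```java
public class TerminatorCount {

    // Counts the '\r' and '\n' characters in the text. BufferedReader.readLine()
    // treats "\n", "\r", and "\r\n" as line terminators and strips all of them,
    // so this is the number of characters a readLine()-based count would miss.
    static int terminatorChars(String text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' || c == '\n') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String unixStyle = "one\ntwo\n";          // hypothetical sample
        String windowsStyle = "one\r\ntwo\r\n";   // same lines, Windows terminators
        System.out.println("Unix-style terminator chars:    " + terminatorChars(unixStyle));
        System.out.println("Windows-style terminator chars: " + terminatorChars(windowsStyle));
    }
}
```

The same two lines lose 2 characters in the Unix-style sample and 4 in the Windows-style one, so the gap relative to a byte count depends on how the file was saved.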
Ah, thanks for that info... Hmm, will read() work instead, since it doesn't read lines like readLine()?
Yup, that's the way to go: you want to reproduce the original file content exactly
after Huffman decompression, so you should account for every single byte when
you compress. readLine() is a no-no here, unless you add those end-of-line
characters to the Huffman tables/tree yourself.
kind regards,
Jos
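One possible shape for a byte-by-byte count is sketched below. It is not the poster's code: the class and method names are invented, and it assumes Java 8 or later for Map.merge and UncheckedIOException. It uses InputStream.read(), which returns each byte as an int in the range 0 to 255 (or -1 at end of stream), so line terminators and bytes above 127 are tallied like any other value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

public class ByteFrequency {

    // Tallies every byte in the stream, keyed by its unsigned value (0..255).
    // A TreeMap keeps the keys sorted, which makes the printed table easier to read.
    static Map<Integer, Integer> count(InputStream in) {
        Map<Integer, Integer> freq = new TreeMap<>();
        try {
            int b;
            while ((b = in.read()) != -1) {
                freq.merge(b, 1, Integer::sum); // insert 1, or add 1 to the existing count
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return freq;
    }

    public static void main(String[] args) {
        // Hypothetical in-memory sample standing in for the real file.
        byte[] sample = "ab\r\nab\n".getBytes(StandardCharsets.US_ASCII);
        Map<Integer, Integer> freq = count(new ByteArrayInputStream(sample));

        int total = 0;
        for (Map.Entry<Integer, Integer> e : freq.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
            total += e.getValue();
        }
        System.out.println("Total Number of Characters: " + total);
    }
}
```

To run it against a real file, the ByteArrayInputStream could be swapped for a BufferedInputStream wrapped around a FileInputStream. The total should then agree with the file size in bytes, since nothing is skipped.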
OK, thanks for the heads up... will give it a try tomorrow... :)
Following the advice here, I have modified my code to write the frequency of each character, keyed by its ASCII code, to a file.

    import java.io.*;
    import java.util.*;

    public class SourceModel
    {


        public static void main(String [] Args)
        {

            Map<Integer,Integer> m = new HashMap<Integer,Integer>();

            String sourceFile = "DecOfInd.txt";
            String outputFile = "DisProb_"+sourceFile;

            String str = "";

            try
            {

                File source = new File(sourceFile);
                FileInputStream in = new FileInputStream(source);


                int size = (int)source.length();
                byte[] text = new byte[size];

                System.out.println("The size of the file is "+source.length());

                int b = in.read(text);
                int count;

                for (int i = 0; i < size ; i++)
                {
                    if(m.containsKey((int)text[i]))
                    {
                        count = m.get((int)text[i]);
                        count++;
                        m.put((int)text[i],count);
                    }
                    else
                    {
                        m.put((int)text[i],1);
                    }
                }

                in.close();
            }

            catch (IOException e)
            {

            }

            // Writing Character Frequency To File

            try
            {
                BufferedWriter out = new BufferedWriter(new FileWriter(outputFile));

                int k = 0;

                for(int i : m.keySet())
                {
                    k += m.get(i);
                    out.write(i + " = " + m.get(i) + '\n');
                }

                out.write("Total Number of Characters: "+k+'\n');

                out.close();
            }
            catch (IOException e)
            {

            }


            System.out.println();
            System.out.println(m.size() + " distinct letters:");
            System.out.println(m);

            int j = 0; int feq;

            for(int i : m.keySet())
            {
                j += m.get(i);
                System.out.println(i + " = " + m.get(i));
            }

            System.out.println("Total Number of Characters: "+j);

        }
    }

While it runs fine on my machine (Win XP), it seems to be giving me problems over at the school's server...

    $ javac -classpath . SourceModel.java
    SourceModel.java:11: not a statement
    Map<Integer,Integer> m = new HashMap<Integer,Integer>();
    ^
    SourceModel.java:11: ';' expected
    Map<Integer,Integer> m = new HashMap<Integer,Integer>();
    ^
    SourceModel.java:63: ';' expected
    for(int i : m.keySet())
    ^
    SourceModel.java:72: illegal start of expression
    }
    ^
    SourceModel.java:85: ';' expected
    for(int i : m.keySet())
    ^
    SourceModel.java:93: illegal start of expression
    }
    ^
    SourceModel.java:91: ';' expected
    System.out.println("Total Number of Characters: "+j);
    ^
    7 errors

Is there something I need to do to make it Unix friendly?
It has nothing to do with Unix or not: your school machine is running Java 1.4
or earlier; your home machine is running Java 1.5 or later.
kind regards,
Jos
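The lines flagged by the compiler are the ones using generics (Map<Integer,Integer>) and the enhanced for loop, both of which arrived with Java 1.5, along with the autoboxing the Map code relies on. If upgrading the school's compiler is not an option, one workaround is to avoid those features altogether. The sketch below swaps the Map for a plain int array indexed by byte value, which is a different technique from the posted code, and sticks to syntax that predates 1.5. The class name SourceModel14 and the sample bytes are hypothetical.

```java
public class SourceModel14 {

    // Tallies how often each byte value occurs, using only pre-1.5 syntax:
    // no generics, no enhanced for loop, no autoboxing.
    static int[] countBytes(byte[] data) {
        int[] freq = new int[256];
        for (int i = 0; i < data.length; i++) {
            // Java bytes are signed (-128..127); masking maps them to 0..255
            // so they can be used as array indices.
            freq[data[i] & 0xFF]++;
        }
        return freq;
    }

    // Sums all the counts; this should equal the number of bytes counted.
    static int total(int[] freq) {
        int sum = 0;
        for (int i = 0; i < freq.length; i++) {
            sum += freq[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the ASCII bytes of "a\nbb\n".
        byte[] data = new byte[] {97, 10, 98, 98, 10};
        int[] freq = countBytes(data);
        for (int i = 0; i < freq.length; i++) {
            if (freq[i] > 0) {
                System.out.println(i + " = " + freq[i]);
            }
        }
        System.out.println("Total Number of Characters: " + total(freq));
    }
}
```

An array also suits the Huffman step reasonably well, since there are at most 256 distinct byte values and the table can be scanned in order without a key set.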
Thanks for the quick reply... time to email my lecturer...