read web page

I need to read a web page, search it, and put the information I find
into a text file.
The web page is set up as a table and displays information on files stuck
in a queue. What I need to do is read the page (without the user seeing it,
if possible), search for certain words, and get the matching row of data.

example:
table looks like this:
docNumber FileName
---------------------------------
1234 <docname>text.xml</docname>
I would need to search the page for text.xml, grab the docNumber, and put
that into the text file.
Is this possible? If so, any suggestions on how to do it?
thx
Nov 16 '05 #1
I'm not sure if I understood your requirements correctly. It might be
worthwhile to check out the following classes in MSDN

HttpWebRequest
HttpWebResponse

Basically, you have to read the web page with an HTTP request.

In order to parse the page, you need an HTML parser. The .NET XML
parser can also read HTML pages, provided the markup is well-formed.

-vJ
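
The HttpWebRequest / HttpWebResponse approach mentioned above might look roughly like the sketch below. The class name PageFetcher and the UTF-8 encoding are assumptions made for illustration; a real page may declare a different character set. Nothing is shown to the user, because the request runs entirely in code.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Downloads the page at 'url' and returns its body as a string.
    // Assumes the page is UTF-8 encoded (an assumption, not from the thread).
    public static string Fetch(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            return ReadAll(body, Encoding.UTF8);
        }
    }

    // Reads a stream to the end using the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call such as PageFetcher.Fetch("http://example.com/queue.aspx") (a placeholder URL) would return the page source as one string, ready to be searched.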
Nov 16 '05 #2
You can do this by using the System.Net.WebClient object.

http://msdn.microsoft.com/library/de...sctorTopic.asp

has an example of how to use it. You can even load that string into
an XmlTextReader, which will allow you to traverse the nodes as if it
were an XML document.

hope this helps
Nov 16 '05 #3
Ben Dewey wrote:
You can do this by using the System.Net.WebClient object.

http://msdn.microsoft.com/library/de...sctorTopic.asp
has an example of how to use it. You can even load that string into
an XmlTextReader, which will allow you to traverse the nodes as if it
were an XML document.


Only if it is XHTML.

Cheers,

--
Joerg Jooss
jo*********@gmx.net
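
For the case where the page really is well-formed XHTML, the XML route could be sketched as follows. This uses XmlDocument rather than XmlTextReader for brevity. The table layout (docNumber in the first td of each row, the file name inside a docname element) is an assumption based on the example in the original post, and the class name XhtmlQueueReader is hypothetical. A page with a DOCTYPE or HTML entities such as &nbsp; may need extra handling that is not shown here.

```csharp
using System;
using System.Xml;

public static class XhtmlQueueReader
{
    // Returns the text of the first <td> in the row whose <docname> equals
    // fileName, or null when no row matches. LoadXml throws XmlException
    // if the markup is not well-formed, which is the "only if it is XHTML" caveat.
    public static string FindDocNumber(string xhtml, string fileName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xhtml);

        foreach (XmlNode row in doc.GetElementsByTagName("tr"))
        {
            XmlElement rowElement = (XmlElement)row;
            XmlNodeList names = rowElement.GetElementsByTagName("docname");
            XmlNodeList cells = rowElement.GetElementsByTagName("td");

            if (names.Count > 0 && cells.Count > 0 &&
                names[0].InnerText.Trim() == fileName)
            {
                return cells[0].InnerText.Trim();
            }
        }
        return null;
    }
}
```

Ordinary HTML with unclosed tags such as <br> would make LoadXml throw, so this only suits pages that are known to be well-formed.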

Nov 16 '05 #4
If it is just a standard HTML page you can just search it as a string.
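
As a rough sketch of that string search applied to the original question: the helper below looks for the file name, walks back to the start of its table row, and takes the first number found in a cell of that row. The row layout (a tr whose first cell holds the docNumber) is assumed from the example in the first post, and the names QueuePageSearch, FindDocNumber and AppendToFile are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class QueuePageSearch
{
    // Returns the docNumber from the row that contains fileName,
    // or null if the file name or a number in its row cannot be found.
    public static string FindDocNumber(string html, string fileName)
    {
        int nameIndex = html.IndexOf(fileName, StringComparison.OrdinalIgnoreCase);
        if (nameIndex < 0) return null;

        // Start of the table row that contains the file name.
        int rowStart = html.LastIndexOf("<tr", nameIndex, StringComparison.OrdinalIgnoreCase);
        if (rowStart < 0) return null;

        // First run of digits that forms the whole content of a tag,
        // between the row start and the file name.
        string rowPrefix = html.Substring(rowStart, nameIndex - rowStart);
        Match match = Regex.Match(rowPrefix, @">\s*(\d+)\s*<");
        return match.Success ? match.Groups[1].Value : null;
    }

    // Appends one docNumber per line to the text file at 'path'.
    public static void AppendToFile(string path, string docNumber)
    {
        File.AppendAllText(path, docNumber + Environment.NewLine);
    }
}
```

Note that a plain substring search for text.xml would also match a name like context.xml, so searching for the full <docname>text.xml</docname> fragment may be safer if the page really uses that tag.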

I attached some code that I have used in the past to get data from a
webpage. In this example I was just searching the _download string for
certain keywords.

using System;
using System.Net;
using System.Text;
using System.Web;

/// <summary>
/// Defines the different connection types available for retrieving whois data.
/// </summary>
public enum CONNECTION_TYPE
{
    /// <summary>Uses GET as the method for sending data to the server</summary>
    GET,
    /// <summary>Uses POST as the method for sending data to the server</summary>
    POST
}

// _server, _additionalPostData, _connectionType, _download and _isConnected
// are fields of the containing class (not shown).
protected void Connect()
{
    string serverUrl = null, postData = null;
    byte[] myDataBuffer = null;
    WebClient httpClient = new WebClient();
    try
    {
        // Generate the post data.
        postData = HttpUtility.UrlEncode(_additionalPostData);
        // Connect to the server.
        switch (_connectionType)
        {
            case CONNECTION_TYPE.GET:
                serverUrl = _server + "?" + postData;
                myDataBuffer = httpClient.DownloadData(serverUrl);
                break;
            case CONNECTION_TYPE.POST:
                serverUrl = _server;
                httpClient.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
                byte[] postArray = Encoding.ASCII.GetBytes(postData);
                myDataBuffer = httpClient.UploadData(serverUrl, "POST", postArray);
                break;
        }
        // Decode the response; ASCII turns any non-ASCII byte into '?'.
        _download = Encoding.ASCII.GetString(myDataBuffer);
        _isConnected = true;
    }
    catch (Exception)
    {
        _isConnected = false;
        throw; // rethrow without resetting the stack trace
    }
}
Nov 16 '05 #5