A Simple HTML File Browser

An extract from Rob's initial design notes

Design Criteria

It must be able to display diagrams and photographs in the text stream. Therefore it must be graphics-based. Decision: make it a Windows program. It will flow text so as to fit the current width of the window.

However, it must provide for the <listing> and <code> tags which display text 'as is'. This may mean that text will have to be right-chopped. In this case it must provide for horizontal scrolling. This is also necessary for wide images when they will not fit completely within the current window width.

Design Questions

Do I (1) make it translate the entire HTML file into a graphics map before display and then scroll the map? This will require a limit on galley width. Or, if I use the maximum galley width encountered in the file, it may mean using up absolutely vast amounts of memory to accommodate it.

Or do I (2) map only what is currently to be displayed? Then I can make the image as wide as the widest line within the field of display. The biggest problem here is scrolling backwards. How do I determine where the beginning of the previous line is in the file? Here, display speed could be a problem. However, it will require far less disk space to accommodate files.

Since I may need to store lots of HTML and GIF files on the disk, it is sensible to adopt choice (2). The program will therefore need the following.

HTML File Pointers

It must have two file pointers FP1 and FP2 which mark the beginning and end of the section of the HTML file stream currently being displayed. These pointers will not necessarily always be the same distance apart for a given window size. This is because they will have to traverse non-displayable parts of the file - namely the HTML tags and their attributes. The positions of these two file pointers must therefore be computed independently every time one of them is moved.

When a given HTML file is first displayed, FP1 remains at the start of the file while FP2 is advanced. As FP2 advances, the HTML tags it encounters are expedited and the text is displayed. This continues until the window is full.

With <listing> and <code> tags, text flow may be interrupted as follows. As the window's right margin is reached at the end of each line of text, display is suppressed, but scanning of the text is continued until the full virtual galley width has been reached.

Scrolling

To move forwards in the file (ie down the page) FP2 is advanced until a complete new displayable line has been abstracted from the HTML file. FP1 is then advanced to the beginning of what was the second line of the display, which is now the first. This is made easier if a file pointer FP3 is always used to keep the start position of the second row of the current display. Then all I have to do is copy FP3 into FP1. Then, while re-displaying the window, put the start of the new second line into FP3.

To move backwards is more difficult. First scan backwards from FP1 until enough text or graphics image has been re-abstracted to produce the previous complete line. The start of this line is the new FP1, the old value of FP1 having first been put in FP3. FP2 is then moved back to the beginning of the previous line. It is therefore a good idea also to keep the next-to-last line pointer in FP4 for this purpose.

Another approach would be to scan the whole HTML file before it is displayed to find the position of the beginning of each displayable line within the HTML file. These positions would have to be re-computed if the window were re-sized. However, a resizing event does not occur anywhere near as often as a scrolling event. Therefore, this could save a lot of time and complication for the scrolling operation. An array of file pointers FP[] could be defined into which the places where the displayable lines start in the HTML file are put during a pre-display scan of the whole file.

HTML files tend to be short as text files go. They are usually no more than a page of A4 - about 64 occupied lines. An array FP[1024] would comfortably accommodate files of 16 pages - about a chapter. However, this could be a dynamic array which is allocated at the time the file is scanned. I think this is the best approach.
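
A rough sketch of such a pre-display scan is given below. It is only an illustration under simplifying assumptions, not the eventual implementation: the name build_line_index and its parameter cols are hypothetical, text is treated as fixed-pitch with a hard break every cols displayable characters (no word wrap, no images), and everything between '<' and '>' is skipped as non-displayable. The array grows with realloc() as lines are found, so no fixed limit such as FP[1024] is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd array holding the file offset at
   which each displayable line starts; *count receives the number of lines.
   Assumptions: fixed-pitch text, a hard break every `cols` displayable
   characters, and anything between '<' and '>' is a tag to be skipped. */
long *build_line_index(FILE *fp, int cols, size_t *count)
{
    size_t cap = 64, n = 0;
    long *fpos = malloc(cap * sizeof *fpos);
    int c, col = 0, in_tag = 0;
    long pos;

    assert(cols > 0);
    if (fpos == NULL) return NULL;
    rewind(fp);
    for (;;) {
        pos = ftell(fp);                 /* offset of the byte about to be read */
        c = getc(fp);
        if (c == EOF) break;
        if (in_tag) {                    /* inside a tag: nothing to display */
            if (c == '>') in_tag = 0;
            continue;
        }
        if (c == '<') { in_tag = 1; continue; }
        if (col == 0) {                  /* first displayable byte of a line */
            if (n == cap) {              /* grow the dynamic array */
                long *tmp = realloc(fpos, (cap *= 2) * sizeof *fpos);
                if (tmp == NULL) { free(fpos); return NULL; }
                fpos = tmp;
            }
            fpos[n++] = pos;
        }
        if (++col == cols) col = 0;      /* line full: next byte starts a new one */
    }
    *count = n;
    return fpos;
}
```

The caller would free() the array and re-run the scan whenever the window is re-sized, since the line breaks then fall in different places.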

Scrolling now becomes much simpler. The program simply moves up or down the array as required. It then abstracts the new line from the HTML file. It then copies the screen contents (pixel-by-pixel) up or down by the number of pixels equivalent to 1 line of text. Finally, it lays in the new line. So to progress one line along the HTML file (ie down the page), the process is as shown below.

And to progress one line backwards along the HTML file (ie up the page), the process is as shown below.
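
My reading of the one-line scroll in either direction can be sketched in C as follows. This is a hedged illustration only: the struct view and the function scroll_one_line are hypothetical names, and the pixel copy and the laying-in of the new line are left to the caller because they depend on the windowing system. The function just moves along the array and reports which line has to be abstracted from the HTML file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view state built on the array of line-start positions. */
struct view {
    const long *fpos;  /* line-start offsets from the pre-display scan */
    size_t nlines;     /* number of entries in fpos */
    size_t top;        /* index of the line shown at the top of the window */
    size_t rows;       /* lines per window-full */
};

/* Scroll by one line: dir = +1 goes down the page, dir = -1 goes up.
   Returns the index of the line that must now be fetched and laid in
   (the new bottom line when going down, the new top line when going up),
   or -1 if the view is already at that end of the file. */
long scroll_one_line(struct view *v, int dir)
{
    assert(dir == 1 || dir == -1);
    if (dir > 0) {
        if (v->top + v->rows >= v->nlines) return -1;  /* last line showing */
        v->top++;
        return (long)(v->top + v->rows - 1);
    }
    if (v->top == 0) return -1;                        /* first line showing */
    v->top--;
    return (long)v->top;
}
```

Given the returned index i, the caller would seek to v->fpos[i] in the HTML file, shift the existing pixels by one text line, and draw the new line in the vacated strip.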

There is also a need to be able to advance one whole window full at a time. This is done as follows:

The number of lines per window-full, n, varies with the current window height and with the point-size of the current font. It must therefore be re-calculated whenever either of these is changed (which isn't very often). The pointers p1 and p2 are pointers to file-pointers. The base pointer p is also a pointer to a file-pointer and points to the start of the dynamic array of file pointers. To advance along the HTML file by one window-full, proceed as follows.

p1 += n; p2 += n;   //increment the two pointers by n lines.
Then, starting at point *p1 (ie from the byte position within the HTML file stream held in the array element pointed to by p1) in the HTML file, display the interpreted contents of the file. To go back, proceed as before except decrement p1 and p2 viz.
p1 -= n; p2 -= n;   //decrement the two pointers by n lines.
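
The bare increments above would run off either end of the array near the start or end of the file. A minimal sketch of a bounds-checked version follows. The function name page_move and the parameter total (the number of entries in the array) are hypothetical, and I am assuming here that p2 points one element past the last line displayed and that file positions are stored as long offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Move the window by one window-full of n lines, clamped to the array.
   p     : base of the array of line-start positions
   total : number of entries in the array (hypothetical parameter)
   dir   : +1 to advance, -1 to go back
   p1    : in/out, first line displayed
   p2    : out, one past the last line displayed (assumed convention) */
void page_move(long *p, size_t total, size_t n, int dir, long **p1, long **p2)
{
    assert(*p1 >= p && (size_t)(*p1 - p) <= total);

    size_t top = (size_t)(*p1 - p);
    size_t max_top = total > n ? total - n : 0;  /* last full window starts here */

    if (dir > 0)
        top = (top + n > max_top) ? max_top : top + n;
    else
        top = (top < n) ? 0 : top - n;

    *p1 = p + top;
    *p2 = p + (top + n < total ? top + n : total);
}
```

With this clamping the final window-full always ends on the last line of the file rather than showing a partly empty window. That is a design choice, not something the notes above require.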

Hyper-Link Mapping

The positions of hyper-text within the window have to be registered. This is best done by an array of co-ordinates. The resolution required is that of a pixel. Remember that hyper-links can be hot-spots on geographic maps. A pair of 16-bit co-ordinates will therefore suffice.

What is the best way of expressing the positions of hyper-text within a window? The first consideration is that the test to see if the mouse pointer is within a hot-spot must be kept as fast and as simple as possible. It has to be done every time the mouse moves. The hot-spots must therefore be expressed in a form as close to mouse co-ordinates as possible. At the very least they should be in window-relative pixels. For each item of hyper-text you need to store the diagonal co-ordinates of the pixel box containing it, together with the URL with which it is associated.

Hyper-Link Data Structure

For each hyper-link on the screen, a data structure is needed to accommodate the above details. A suitable data structure is as follows.
struct hyperlink {    //define a hyperlink data structure
  int y1;             //Y-co-ordinate of top left of hot-box
  int y2;             //Y-co-ordinate of bottom right of hot-box
  int x1;             //X-co-ordinate of top left of hot box
  int x2;             //X-co-ordinate of bottom right of hot box
  FILE *url;          //file pointer to stream containing its URL
}
  *HL[256];           /*declare an array of pointers to 
                        hyperlink structures */
A 256-element array of pointers to structures of this type is declared. For the time being it will be presumed to be global (ie declared outside all functions). I think that an upper limit of 256 hyperlinks per screen is more than adequate. Memory for each structure will be allocated using malloc() each time a hyper-link is encountered when a new window full of text/graphics is displayed.
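
The allocation step might look something like the sketch below. The names add_hyperlink, clear_hyperlinks, HLcount and MAX_LINKS are hypothetical and are introduced only for illustration; the structure itself is repeated so that the fragment compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LINKS 256            /* hypothetical name for the array size */

struct hyperlink {
    int y1, y2;                  /* top and bottom edges of the hot-box */
    int x1, x2;                  /* left and right edges of the hot-box */
    FILE *url;                   /* stream containing its URL */
};

struct hyperlink *HL[MAX_LINKS]; /* global array of pointers, as above */
int HLcount = 0;                 /* hypothetical: links currently on screen */

/* Register one hyper-link met while a window-full is being displayed.
   Returns its index in HL[], or -1 if the array is full or malloc fails. */
int add_hyperlink(int x1, int y1, int x2, int y2, FILE *url)
{
    struct hyperlink *h;

    if (HLcount >= MAX_LINKS) return -1;
    h = malloc(sizeof *h);
    if (h == NULL) return -1;
    h->x1 = x1; h->y1 = y1;
    h->x2 = x2; h->y2 = y2;
    h->url = url;
    HL[HLcount] = h;
    return HLcount++;
}

/* Release every registered link before a new window-full is displayed. */
void clear_hyperlinks(void)
{
    while (HLcount > 0) {
        free(HL[--HLcount]);
        HL[HLcount] = NULL;
    }
}
```

HLcount would then serve as the count of hyper-links to be tested on each mouse movement.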

HyperCheck Function

Next is required a function to check whether or not the mouse is currently over hyper-text. This function must receive the current window co-ordinates of the mouse and the current number of hyper-links displayed on the screen (ie the number of hyper-links to be tested).
FILE *HyperCheck  //check whether or not mouse is over hyper-text
(
  int x,          //pixel x-co-ordinate of mouse within window
  int y,          //pixel y-co-ordinate of mouse within window
  int k           //number of hyper-links currently on screen
) {
  struct hyperlink **p;    //pointer to an element of the HL[] array
  FILE *f = NULL;          //pointer to the hyper-link URL
  for(p = HL; k > 0; k--, p++) {
    if(y > (*p)->y1 && y < (*p)->y2 && x > (*p)->x1 && x < (*p)->x2) {
      f = (*p)->url;       //set hyper-link pointer to URL
      break;               //stop at the first hot-box containing the mouse
    }
  }
  if(f != NULL && cursor == arrow) {
    [change mouse cursor from an arrow to a hand];
  }
  else if(f == NULL && cursor == hand) {
    [change mouse cursor from hand to arrow];
  }
  return(f);               //return pointer to hyper-link URL
}
The hyper-text pointed to by the mouse, if any, has a corresponding URL stored within the HTML file. The function returns a file pointer which points to the text of the URL within the HTML file. If the mouse is not currently pointing at any hyper-text it returns a NULL file pointer.

HyperLink Function

A function is now needed which is called automatically whenever the left mouse button is pressed. This function must check whether or not the mouse pointer is over hyper-text, and if so, read the corresponding URL from the HTML file and expedite the loading of the new file...