thoughts on HTML-documentation-viewers

MU
Posts: 13649
Joined: Wed 24 Aug 2005, 16:52
Location: Karlsruhe, Germany

#1 Post by MU »

Concerning News-log:
http://puppylinux.com/news/comments.php ... 702-083738


HTML-viewers can be pretty (gtkmoz-embedded) and large, or small and not very capable (Dillo).

Must a documentation browser be capable? No.

It must be able to display:
bold text
italic text
links
pictures
This is sufficient. Tables or coloured backgrounds would be nice, but are not strictly needed.

Following this assumption, Dillo is overkill ;-)
Gtkbasic has a Textwidget. It currently lacks an important part: markers.
With markers, you'd be able to display coloured, bold, italic text and so on.
Gtkbasic also lacks pixmap support, but that is at the top of my todo list.

So if those functions were implemented, you could write some Basic code like this:

PSEUDO-Code, commands might not exist in this syntax:


thefile = readfile(thehelpfile)
for each theline in thefile

  // find position of bold text (assuming instr returns 0 if not found)
  tstart = instr( theline , "<b>" )
  tend = instr( theline , "</b>" )

  // remove HTML-code
  theline = replace ( theline , "<b>" , "" )
  theline = replace ( theline , "</b>" , "" )

  // removing "<b>" shifts the end position left by 3 characters
  tend = tend - len("<b>")

  //add it to the textwidget
  addline_to_textview( textview1 , theline )
  if tstart > 0 and tend > tstart then
    t = set_textmodifier("bold")
    gtk("gtk_modify_textview" , t , tstart , tend)
  end if
next
So you would parse a text for simple HTML-tags, and "translate" them into directives for the textview.
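
As a rough illustration of that "translate" step in a language that exists today, here is a minimal Python sketch. It strips <b> and <i> tags from one line and returns the plain text plus the character ranges that a textview would need to style. The names parse_line and STYLES are made up for this example, and nothing here is Gtkbasic or GTK API; applying the ranges to a real widget is left out.

```python
import re

# Hypothetical mapping from HTML tag names to textview style names.
STYLES = {"b": "bold", "i": "italic"}

# Matches simple opening and closing tags without attributes, e.g. <b> or </i>.
TAG = re.compile(r"</?([a-zA-Z]+)>")


def parse_line(line):
    """Return (plain_text, spans); each span is (style, start, end) in plain_text."""
    plain = []      # text pieces with the known tags removed
    spans = []      # finished (style, start, end) triples
    open_at = {}    # style -> output offset where it was opened
    pos = 0         # read position in the input line
    out_len = 0     # length of the plain text produced so far

    for m in TAG.finditer(line):
        name = m.group(1).lower()
        if name not in STYLES:
            continue  # unknown tag: it stays in the text unchanged
        chunk = line[pos:m.start()]
        plain.append(chunk)
        out_len += len(chunk)
        pos = m.end()
        style = STYLES[name]
        if m.group(0).startswith("</"):
            start = open_at.pop(style, None)
            if start is not None:  # ignore a closing tag with no opener
                spans.append((style, start, out_len))
        else:
            open_at.setdefault(style, out_len)

    plain.append(line[pos:])
    return "".join(plain), spans


text, spans = parse_line("This is <b>bold</b> and <i>italic</i> text.")
print(text)
print(spans)
```

Because the offsets are computed against the text after the tags are removed, they can be handed to the widget directly, with no correction for the stripped characters.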

Instead of Gtkbasic, you could write such a parser in any other language that has text widgets supporting typical modern layout features.

Mark

BarryK
Puppy Master
Posts: 9392
Joined: Mon 09 May 2005, 09:23
Location: Perth, Western Australia

#2 Post by BarryK »

Yeah, I wrote a basic html parser for my little EVE vector editor a few years ago. It read the basic tags for italic, bold, text-color. I know nothing about the "proper" way to write parsers, just started coding.

But turning it into a fully-fledged html viewer is another thing.

DavidBell
Posts: 132
Joined: Fri 24 Nov 2006, 21:44

#3 Post by DavidBell »

I did a crude XML parser in VB once that wasn't so different. One of the nice things about XML/HTML is that it was designed to be parsed, so it tends to be easy to do.

I can see a couple of problems. First, I looked at the GTK text widget and I am wondering whether it supports hyperlinks.

Second, that type of search-and-replace involves a lot of memory reallocation, especially if you use a high-level language, which often does create new, copy, destroy old for every string operation. I have found in the past that this can seem to work well on a small file but slow down dramatically with large files (I also found to my cost once that string operations in VB take roughly 1000 times longer once the string is over 32KB).

So I think a better way to do this is to allocate memory once for the processed string, then work through the old string, copying byte by byte while looking ahead for tags and translating them on the fly. In pseudocode, something like:


OldLength = Length(OldString)
String NewString(OldLength * 2)  ' extra length just in case

OldMarker = 0
NewMarker = 0

While (OldMarker < OldLength)

    If (OldString(OldMarker) <> "<")
         'Copy Character
         NewString(NewMarker) = OldString(OldMarker)
         NewMarker++
         OldMarker++
    Else
         'Process Tag, Update Markers
    EndIf

End While

I think there is one called TinyXML on SourceForge that uses this method in C++.
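
For anyone who wants to try the single-pass idea, here is a minimal Python sketch of the same loop. It fills in the "Process Tag" branch in the simplest possible way, by skipping everything up to the closing ">", which is an assumption for illustration only; a real viewer would record styling information there. The function name strip_tags is made up, and a Python list stands in for the preallocated buffer, so this shows the structure of the approach and says nothing about speed.

```python
def strip_tags(old):
    """Copy old into a buffer allocated once, in one pass, dropping <...> tags."""
    # Stripping tags never makes the text longer, so len(old) slots are enough.
    new = [""] * len(old)
    old_marker = 0
    new_marker = 0

    while old_marker < len(old):
        ch = old[old_marker]
        if ch != "<":
            # Copy character
            new[new_marker] = ch
            new_marker += 1
            old_marker += 1
        else:
            # Process tag: look ahead for the matching ">"
            close = old.find(">", old_marker)
            if close == -1:
                # No ">" follows, so treat this "<" as ordinary text.
                new[new_marker] = ch
                new_marker += 1
                old_marker += 1
            else:
                # Skip the whole tag; a viewer would note the style change here.
                old_marker = close + 1

    return "".join(new[:new_marker])


print(strip_tags("a <b>bold</b> word"))
```

The buffer is allocated once and only the used part is joined at the end, so no intermediate copies of the whole string are made while scanning.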

DB
