ThreadGet

gcmartin

This PET worked in FATDOG630

#31 Post by gcmartin »

Downloaded and installed in a pristine FATDOG630. Works! Thanks.

davids45
Posts: 1326
Joined: Sun 26 Nov 2006, 23:33
Location: Chatswood, NSW

Spaced Out?

#32 Post by davids45 »

G'day again Flash,

I tried using a file name with a space (as per your posted image with a space in the file name for the htm file) and got nothing in ForumThreads directory.

Using a space-free file name, I got the desired threads file.

It's an old .pet and maybe needs an update to be able to include spaces in the file name?

David S.

tlchost
Posts: 2057
Joined: Sun 05 Aug 2007, 23:26
Location: Baltimore, Maryland USA

broken links

#33 Post by tlchost »

No answer to question
Last edited by tlchost on Mon 10 Feb 2014, 07:55, edited 2 times in total.

Flash
Official Dog Handler
Posts: 13071
Joined: Wed 04 May 2005, 16:04
Location: Arizona USA

#34 Post by Flash »

Thanks davids45, I hadn't even thought about the space in the filename. :oops: I'll give it a try without a space and report back, later.

Later: You were right. When I used an underscore instead of a space, ThreadGet downloaded three pages of a thread from the forum and put them in /root/ForumThreads, in a file I told it to name ThreadGet_test. :D
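The underscore workaround can be automated. This is only a minimal sketch in Python, not how ThreadGet itself works; `safe_filename` is a hypothetical helper that replaces every run of whitespace in the requested name with a single underscore before the name is used for the output file.

```python
import re


def safe_filename(name: str) -> str:
    """Hypothetical helper: replace each run of whitespace with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


print(safe_filename("ThreadGet test"))  # prints ThreadGet_test
```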

tlchost: how did you open the html file? When I click on it, it opens in SeaMonkey but the links in the file don't seem to work for me either.

count
Posts: 5
Joined: Thu 30 Jan 2014, 22:16

#35 Post by count »

Hi davids45 - yep the download of the program worked, and it is running happily, have run some successful tests so far.

HOWEVER - I am looking at using this as a second stage to HTTracker on forum posts hundreds of pages long.

My method has three stages:
1 - Download the original thread to local storage, scraping out as much as possible. I am also looking at downloading the printer versions of threads, to avoid the bulk of the html code.
2 - Strip out as much html info as possible, leaving just the posts with the related post date/time, poster name, etc.
3 - Put the scraped pages into one html file.
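Stages 2 and 3 above could be sketched roughly as follows. This is an illustration under assumptions, not ThreadGet's code: it assumes stage 1 has already produced the pages as strings, it uses only Python's standard `html.parser`, and the names `TextOnly`, `strip_page` and `merge_pages` are made up for this example. It keeps all visible text rather than picking out the poster name and date specifically, which would need rules written for the particular forum's markup.

```python
from html import escape
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def strip_page(html_text):
    """Stage 2: reduce one downloaded page to a list of text chunks."""
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    return parser.chunks


def merge_pages(pages):
    """Stage 3: put all stripped pages into a single html document."""
    parts = ["<html><body>"]
    for number, page in enumerate(pages, start=1):
        # Each original page gets a heading with an id, usable as a bookmark.
        parts.append(f'<h2 id="page-{number}">Page {number}</h2>')
        parts.extend(f"<p>{escape(chunk)}</p>" for chunk in strip_page(page))
    parts.append("</body></html>")
    return "\n".join(parts)
```

Calling `merge_pages([page1_html, page2_html])` returns one string that can be written to a single file.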

The main difficulty I am having is in running ThreadGet on locally downloaded websites; they maintain the relative file/folder structure, but the html names of the files have changed in the downloaded version.

When ThreadGet runs on the local html files, it 'ghosts' the dialogue boxes on the screen and goes through the motions, and the result is a blank html file.

Apologies if this sounds a bit vague, I am experimenting on the go, I can give more detail or better descriptions of what is happening.

I wondered if anyone had any issues like this when running ThreadGet on locally stored web pages?

Love the app btw, :D

@tlchost - I just wondered on your post - if it has merged all thread pages into one html file then presumably the links to page## and next, last etc will not work as they are now referring to other external pages... would this not require converting those links to # bookmarks referring into the same html page?
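A rough Python sketch of that link conversion is below. It rests on assumptions that are not confirmed anywhere in this thread: that page links look like `viewtopic.php?t=...&start=N`, that each page holds 15 posts, and that the merged file already contains bookmarks named `page-1`, `page-2` and so on. `localize_links` and `POSTS_PER_PAGE` are hypothetical names.

```python
import re

POSTS_PER_PAGE = 15  # assumption: posts shown per forum page

# Assumed link shape: href="...viewtopic.php?...start=<offset>..."
LINK = re.compile(r'href="[^"]*viewtopic\.php\?[^"]*?\bstart=(\d+)[^"]*"')


def localize_links(html_text, posts_per_page=POSTS_PER_PAGE):
    """Rewrite paged thread links into #page-N bookmarks in the same file."""

    def to_anchor(match):
        page = int(match.group(1)) // posts_per_page + 1
        return f'href="#page-{page}"'

    return LINK.sub(to_anchor, html_text)
```

Links that carry no `start=` value, such as a plain link to the first page, are left untouched by this pattern and would need their own rule.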

Thanks for this cracking little app!

Flash
Official Dog Handler
Posts: 13071
Joined: Wed 04 May 2005, 16:04
Location: Arizona USA

#36 Post by Flash »

Count: downloading a thread that is hundreds of pages long will slow down the forum's server for everyone else while you're doing it. Please consider that while you're experimenting. :)
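One way to go easier on a server is to pause between page requests. The sketch below is an illustration only: `fetch_politely` is a hypothetical function, the five second default is an arbitrary choice, and ThreadGet is not known to do this. The `fetch` and `sleep` parameters exist so the loop can be tried without touching a real server.

```python
import time
from urllib.request import urlopen


def fetch_politely(urls, delay_seconds=5.0, fetch=None, sleep=time.sleep):
    """Download urls one at a time, pausing between consecutive requests."""
    if fetch is None:
        def fetch(url):
            # Default fetcher: plain HTTP GET using the standard library.
            with urlopen(url, timeout=30) as response:
                return response.read()

    pages = []
    for index, url in enumerate(urls):
        if index:  # no pause before the first request
            sleep(delay_seconds)
        pages.append(fetch(url))
    return pages
```

At five seconds per page, a thread of 300 pages would take about 25 minutes, which spreads the load instead of hitting the server in one burst.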

count
Posts: 5
Joined: Thu 30 Jan 2014, 22:16

#37 Post by count »

Flash wrote:Count: downloading a thread that is hundreds of pages long will slow down the forum's server for everyone else while you're doing it. Please consider that while you're experimenting. :)
Hi Flash, yes, thank you, I was aware of that too! It wasn't this forum btw :)

starhawk
Posts: 4906
Joined: Mon 22 Nov 2010, 06:04
Location: Everybody knows this is nowhere...

#38 Post by starhawk »

May I ask a favor? I'm actually on three forums, and one of them is migrating to phpBB from YaBB (specifically YaBB 1 Gold, Service Pack 1.1). I realize that this YaBB version is particularly antiquated at best (one of the main reasons for the move) BUT could ThreadGet be modified somehow to support it? I'd love to help them with the move, and being able to preserve the old posts is more than a little important.

Google tells me that YaBB is based on Perl, rather than PHP.

The forum in question is here --> http://forum.psion2.org/YaBB.pl

version2013
Posts: 503
Joined: Mon 09 Sep 2013, 00:00
Location: Florida, USA

#39 Post by version2013 »

I was thinking of using ThreadGet or 'Mozilla Archive Format' to make a backup of some threads and host them on my site.
This is for when murga-linux.com goes down, as it does occasionally.

Is this allowed?


Mozilla Archive Format (a browser extension)
http://maf.mozdev.org/
http://en.wikipedia.org/wiki/Mozilla_Archive_Format

Flash
Official Dog Handler
Posts: 13071
Joined: Wed 04 May 2005, 16:04
Location: Arizona USA

#40 Post by Flash »

Yes, with proper attribution.

count
Posts: 5
Joined: Thu 30 Jan 2014, 22:16

#41 Post by count »

Hey guys, apologies for the delay I have been busy with other things!

However, I am returning to my data projects, and your help and assistance would be invaluable!

The first issue I am having is that ThreadGet is making multiple copies of the first html page.

So I set first page as page 1, and number of pages as 7, and I get 7 versions of page 1 concatenated!
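The cause is not established in this thread. One possibility, offered only as a guess, is that the target forum pages its threads with a different URL parameter than ThreadGet expects, so every request returns page 1. The Python sketch below shows the general idea of building one URL per page and then checking whether the downloaded pages are identical. The `&start=` offset scheme, the 15 posts per page, and the names `page_urls` and `find_duplicates` are all assumptions for illustration.

```python
import hashlib


def page_urls(base_url, first_page, page_count, posts_per_page=15):
    """Build one URL per page, assuming a '&start=<post offset>' scheme."""
    return [
        f"{base_url}&start={(first_page - 1 + i) * posts_per_page}"
        for i in range(page_count)
    ]


def find_duplicates(pages):
    """Return (page_number, first_seen_page_number) for byte-identical pages."""
    seen = {}
    duplicates = []
    for number, page in enumerate(pages, start=1):
        digest = hashlib.sha256(page.encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates.append((number, seen[digest]))
        else:
            seen[digest] = number
    return duplicates
```

Real pages often embed session ids or timestamps, so two copies of page 1 may not be byte-identical; comparing only the post text would be a more reliable check.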

Also, when I run ThreadGet on the same website locally, it results in a blank file.

Any ideas?

In the meantime I shall do a bit more testing, and report back the results.

stray_dog
Posts: 65
Joined: Wed 19 Mar 2014, 00:14

#42 Post by stray_dog »

Just tried this .pet for the first time tonight using slacko 5.6 and it worked SO well. Thank you so much! This is such a helpful thing for me.

hamoudoudou

a good pet

#43 Post by hamoudoudou »

stray_dog wrote:Just tried this .pet for the first time tonight using slacko 5.6 and it worked SO well. Thank you so much! This is such a helpful thing for me.
Me too.
