DIYstompboxes.com

DIY Stompboxes => Building your own stompbox => Topic started by: waltk on September 01, 2011, 09:40:21 PM

Title: Here's how to save a large topic from the forum
Post by: waltk on September 01, 2011, 09:40:21 PM
Hi All,

Some of the forum threads here are the most valuable resource you could ever find when building a new circuit.  Even if it's not YOUR thread, chances are good that someone else has encountered your problem, and has shared the answer for everyone.

Some of the most interesting threads are also excruciatingly long, and tedious to review in their entirety.  So how do you find the diamond in the rough?

One way is to use the forum search feature (only available for registered users).  If you are lurking here, and haven't registered, you are missing out on a valuable tool.

But suppose you are interested in "Tube boost + overdrive running off a 9 volt battery".  That thread has 125 pages as of this post.  Suppose you are dying to know in great detail everything about a "stutter box" - that's a long thread too.  Even after you find the thread of interest, you may be in for some serious browsing time to find what you are looking for.  Most of us computer-savvy folks know that you can search for things within a page (usually from the Edit menu of your browser).  This helps, but doesn't work with a thread with many pages.

Here's the best part: right above the first post on any page, on the same line that has links to individual pages, all the way over at the right of your browser window - there's a button that reads: "Print".  If you click this button, you'll get a printable version of the entire topic.  I don't recommend printing a large topic (save the trees), but you can also see, browse, and search the entire topic at once.

What else can you do with the "Print" view? If you have Adobe Acrobat, you can create an offline viewable version of the topic.  If you save it to an HTML file on your PC, you can browse it any time - even if not connected.

What? You noticed a problem with this method?  Yes, none of the graphics (schematics, layouts, build photos, etc.) show up in the print view.  Those graphics are some of the best parts.  In their place is a reference to the location of the graphic.  Here's what I do to capture an entire thread with graphics for later reference and review...

1. Use the Print button to display an HTML page with the entire thread.
2. Save it on your computer as an HTML file.
3. Open the HTML file with a text editor that supports "regular expression" search and replace. (I use TextPad, but that's just my preference.)
4. (Here's the tricky part) Construct a regular expression that replaces the references to images with HTML <img ...> tags.
5. Save and open the resulting HTML page with your browser...

and you have the entire topic, including all graphics on one page.  You can then save it - as html, or a PDF (with Adobe Acrobat), or as MHT with IE.

OK, I realize that regular expressions are not within the repertoire of non-computer-geeks.  If this topic grows to over one page, I will write and place in the public domain a single-purpose .NET executable program that will convert topic-print-html to an html version that will display all the graphics in that topic.

Anyone interested?
Title: Re: Here's how to save a large topic from the forum
Post by: defaced on September 01, 2011, 09:48:48 PM
Me likey.

Dumb question, will this executable run on linux/mac?  I use neither, but figured I'd ask so it doesn't come up after the fact. 
Title: Re: Here's how to save a large topic from the forum
Post by: chi_boy on September 01, 2011, 09:57:38 PM
That's actually pretty cool considering it's been right in front of me for so long and I never clicked that button.

Thanks!
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 01, 2011, 10:07:21 PM
Quote: Dumb question, will this executable run on linux/mac?  I use neither, but figured I'd ask so it doesn't come up after the fact.

It will only run on Windows because Windows is the only platform that supports .Net (at least I don't think WINE supports it).

Linux geeks will undoubtedly already be familiar with regular expressions, and could easily do it with some ungodly complex single command-line statement (using GREP).
MAC users will not likely ever see this because they are more interested in touchy-feely artistic things than building their own stuff.

(Don't mean to start a firestorm of flaming here - the previous comments are truly meant to be funny, and not to slam people who don't know a real OS when they see one.)
Title: Re: Here's how to save a large topic from the forum
Post by: head_spaz on September 01, 2011, 11:04:26 PM
Quote: MAC users will not likely ever see this because they are more interested in touchy-feely artistic things than building their own stuff.

Who knew? And here I always thought they spent all their time in their Volvos... or on jury duty.

I think it would be great if each post had a button for users to optionally attribute it with some kind of rating/ranking, in the sense that highly-rated postings would likely have the most pertinent content. Like all of R.G.'s posts.
The search function seems a bit handicapped to me. I especially hate it when it times out on me.
Title: Re: Here's how to save a large topic from the forum
Post by: harmonic on September 01, 2011, 11:17:58 PM
Great bit of advice, Walt. I will use this so often! Thank you. :-)
Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 01, 2011, 11:29:33 PM
You can run basic .NET apps on Linux and Mac using Mono.

http://www.mono-project.com

I'd prefer the single command line option though  ;D
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 01, 2011, 11:35:36 PM
Quote: The search function seems a bit handicapped to me. I especially hate it when it times out on me.

Yes, it's frustrating when there are slowdowns.  I've noticed improvements in the forum software as time goes on.  The timeouts could be the fault of the forum software, or the underlying database that posts are stored in, or even just the internet connection speed at any node between you and the server.

If you look at the fine print at the bottom of every page, you'll see that the DIYSTOMPBOXES Forum is running on SMF (Simple Machines Forum) version 1.1.11.  SMF is an open-source PHP-based forum software.  If one is running a linux/apache web server, this is not a bad choice (obviously, as we have all been using this for a while now).

The 1.1.11 version was released in June 2009, and the current version in this branch is 1.1.14 released in June of 2011.  At the same time version 2.0 has been released, and it will be the "stable production" version going forward.

It's up to Aron as to when new versions are incorporated into the site.  I can tell you that migrating to a new version of any server-based software is a pretty big deal.  I haven't updated my own web site since the turn of the century (really, it's been running since then 24/7 without any changes).  I'm sure Aron will consider upgrading as time and resources allow.

-Walt
Title: Re: Here's how to save a large topic from the forum
Post by: blooze_man on September 01, 2011, 11:51:33 PM
Quote from: jbgron on September 01, 2011, 11:29:33 PM
You can run basic .NET apps on Linux and Mac using Mono.

http://www.mono-project.com

I'd prefer the single command line option though  ;D

That's cool. Now I can do this on my "fake" OS.
Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 01, 2011, 11:54:10 PM
Oh dear.  Let's not have an OS war in an electronics forum.
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 01, 2011, 11:55:31 PM
Quote: You can run basic .NET apps on Linux and Mac using Mono.

Hmmm... not being a LinuxHead, I'm not sure I understand what would be required to run this under linux.  According to documentation in the link you provided, mono is compatible with .Net binaries.  I'm thinking this probably means assemblies, and not executables.  If a .Net executable will run under mono, that's fine with me.  If it takes significant extra effort, I don't expect to do it.

I like command-line utilities also, and would probably create the app to handle a simple command-line interface as well (so if you pass it a source file name and target file name, it will just do the conversion with no UI).
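
A command-line mode like that could look roughly like the following Python sketch. It is not the .NET program discussed in this thread: the script name `convert_topic.py`, the function names, and the regular expression are all assumptions made for illustration. The expression assumes the print view shows each image as a bare http:// URL in parentheses ending in gif, jpg or png.

```python
import re
import sys

# Assumption: the saved print view shows each image as a bare URL in
# parentheses, e.g. "(http://example.com/pic.png)". Working on bytes
# avoids guessing the page's character encoding.
IMAGE_REF = re.compile(rb'\((http://[^)]+(gif|jpg|png))\)')


def convert_file(source_path, target_path):
    """Rewrite image references in source_path and save to target_path."""
    with open(source_path, "rb") as src:
        html = src.read()
    # subn returns the new text and the number of replacements made.
    converted, count = IMAGE_REF.subn(rb'<img src="\1" />', html)
    with open(target_path, "wb") as dst:
        dst.write(converted)
    return count


def main(argv):
    # Exactly two file names: convert silently, no UI.
    if len(argv) != 3:
        print("usage: convert_topic.py SOURCE TARGET", file=sys.stderr)
        return 2
    count = convert_file(argv[1], argv[2])
    print(f"converted {count} image reference(s)")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Under those assumptions, something like `python convert_topic.py print.html print_fixed.html` would write the converted copy and leave the saved original untouched.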

Remember though... the interesting part is the technique!  Taking the HTML source from the "Print" button, and just converting the text image references to a valid "img" tag - is the thrust of this post.  Some viewers would be able to figure out how to do this without a standalone tool.

Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 02, 2011, 12:00:04 AM
I haven't looked at Mono for a long time but last I heard you can run any .NET binary but their implementation of Windows.Forms is still a little buggy.  I'd ideally like to see this as a hosted solution that everyone can use, like http://url2pdf.com for example.
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 02, 2011, 12:09:00 AM
Quote: That's cool. Now I can do this on my "fake" OS.

Yes, and some day you can aspire to getting a REAL OS...

KIDDING!  Really!  My comments are meant as gentle teasing.  It happens that I'm a professional developer working almost entirely on the MS platform.  I've had some exposure to other platforms, and had some major frustrations with the MS OS's.  But as any developer will tell you, having acquired the programming expertise in one environment, switching to another is not trivial.  I don't believe there is any intrinsic "goodness" in one or the other - as a practical matter, however, I can earn a living and feed my family by designing and developing software in the environment I know best.

-Walt
Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 02, 2011, 12:12:39 AM
I'm also a professional developer working entirely on the UNIX platform.  I used to be an evangelist but I just can't be bothered anymore.  Somebody great once said, "Nobody has ever changed their mind as a result of losing an argument".  I couldn't agree more.
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 02, 2011, 12:26:32 AM
Quote: I'm also a professional developer working entirely on the UNIX platform.  I used to be an evangelist but I just can't be bothered anymore.  Somebody great once said, "Nobody has ever changed their mind as a result of losing an argument".

Well said, brother.

Quote: I'd ideally like to see this as a hosted solution that everyone can use

Yeah.  Hosting would be good.  I'm just not willing to jump into being the provider of a hosted solution that is so limited in usefulness.

Question for you as a unix guy, if you wanted to take an HTML source file, and just convert references to hosted images (embedded as text in parentheses instead of <img> tags), you would just use grep, right?

-Walt
Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 02, 2011, 12:30:43 AM
Yep, grep and/or sed.
Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 02, 2011, 01:30:07 AM
Quick and dirty but this will do it;

sed -e 's|http://\([a-zA-Z0-9.\,:\/?\&=~%+_#-]*\)|<img src=\"http://\1\" />|g;' print.html > print_fixed.html

Has a hangover of mangling non-image urls too but it's all I have in me this late on a Friday afternoon (in Australia).
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 02, 2011, 11:12:40 AM
Quote: sed -e 's|http://\([a-zA-Z0-9.\,:\/?\&=~%+_#-]*\)|<img src=\"http://\1\" />|g;' print.html > print_fixed.html

Cool.  I'm not familiar with the sed syntax.

I would probably use this as the find expression in TextPad:

(\(http://[^)]+\(gif\|jpg\|png\)\))

and this as the replacement expression:

<img src="\1" />

The .Net regex syntax is a little different, so the find expression would be:

\((http://[^)]+(gif|jpg|png))\)

and the replacement expression would be:

<img src="$1" />
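
For anyone without TextPad or .NET handy, the same find-and-replace can be sketched in Python. This is only an illustration, not the utility discussed in this thread: the function name `embed_images` and the sample text are made up, and it assumes the print view shows each image as a bare URL in parentheses, as described earlier in the topic.

```python
import re

# Same idea as the .NET expression above: a parenthesised http:// URL that
# ends in gif, jpg or png. Group 1 is the URL without the parentheses.
IMAGE_REF = re.compile(r'\((http://[^)]+(gif|jpg|png))\)')


def embed_images(html):
    """Replace '(http://.../pic.png)' references with <img> tags."""
    return IMAGE_REF.sub(r'<img src="\1" />', html)


# Hypothetical sample: one image reference and one ordinary link.
sample = (
    "Schematic: (http://example.com/fuzz.png) "
    "Docs: (http://example.com/readme.html)"
)
print(embed_images(sample))
```

Because the expression insists on an image extension right before the closing parenthesis, the ordinary link in the sample is left alone and only the .png reference becomes an img tag.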
Title: Re: Here's how to save a large topic from the forum
Post by: markeebee on September 02, 2011, 11:36:24 AM
Quote from: jbgron on September 02, 2011, 01:30:07 AM

sed -e 's|http://\([a-zA-Z0-9.\,:\/?\&=~%+_#-]*\)|<img src=\"http://\1\" />|g;' print.html > print_fixed.html


Nice.  I see what you did there.


EDIT
I changed my avatar pic as a tribute.
Title: Re: Here's how to save a large topic from the forum
Post by: pinkjimiphoton on September 02, 2011, 11:43:18 AM
if ya google up cute pdf writer (free) you can also have it print the entire thread, with graphics, into a pdf file.
i do it all the time. you need to set the "print preview" to print ALL the pages, not just the one you're looking at.

edit: i just checked it out, it works but only for 11 pages at a time max. but, you can have it print everything to pdf, including all graphics, backgrounds, etc etc.

so in a big thread, you may have to make parts 1, 2,3, etc...but you can indeed capture the whole thing with just a few clicks, and easier than editing hypertext markup if you're not used to coding.

and cute pdf is freeware. ;)
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 02, 2011, 12:50:58 PM
Quote: i do it all the time. you need to set the "print preview" to print ALL the pages, not just the one you're looking at.

Hmmm....  So Cute PDF Writer will actually navigate to other pages by driving your browser behind the scenes?  That's very surprising.  I wonder how it knows what to send to get from one page in a multipage topic to the next.  Wouldn't the navigational commands be different for various forum software?
Title: Re: Here's how to save a large topic from the forum
Post by: pinkjimiphoton on September 02, 2011, 12:59:41 PM
i just used cute pdf writer to archive the first 11 pages of the ludwig phase II tech notes into one pdf file, bro. want me to send it to ya so you can see?

for real...apparently in the free version, 11 pages is the max it will do. but that's including graphics, backgrounds, avatars, even smileys and stuff. if ya wanna print a 100 page thread, you'd have to go about 10 pages at a time, but it works.

hang on a sec, here's a sendspace of the first 11 pages of the thread i mentioned, in pdf format:

http://www.sendspace.com/file/azhdyz

check it out...first 11 of the 23 pages right there, with ALL graphics, even backgrounds.

Title: Re: Here's how to save a large topic from the forum
Post by: defaced on September 02, 2011, 01:10:47 PM
That's only the first page of the thread.  It takes 11 pages to print, but it's only 1 of 23 pages of the thread. Walt is talking about getting all 23 pages of the thread into one document.
Title: Re: Here's how to save a large topic from the forum
Post by: pinkjimiphoton on September 02, 2011, 01:17:35 PM
i know what you're talking about. i understand hypertext, have webmastered for years. just offered a way that may be easier for some people....they may find it easier to print to a couple pdf's than having to go thru the html code and change all the links to graphics so that the downloaded pages will work for offline viewing.

do what ya want. sorry, dude, trying to be helpful. feel free to do whatever ya wanna do.

me? to me, it's easier to print a couple pdf's with a couple mouse clicks, than to go thru huge hypertext files fixing code. but that's just me.  looking at it, you're right, it's only the first page of the thread...which would take 11 pages to "print". oh well. ;)

later.
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 02, 2011, 01:40:02 PM
Quote: That's only the first page of the thread.  It takes 11 pages to print, but it's only 1 of 23 pages of the thread. Walt is talking about getting all 23 pages of the thread into one document.

Yep, I think we're talking about 2 different things - print pages vs. forum pages.  Turns out that using a PDF writer can still only print one forum page.  One page in a forum topic actually produces several virtual PDF pages when you print it.

I got the idea for this topic because I was looking at a topic that had 125 forum pages (that actually produced 438 print pages).  The goal of this topic was to demonstrate a way to save a file with the entire thread (including graphics) for offline browsing purposes.

I also prefer PDF documents, and I use a commercial PDF writer to produce them.  The Cute PDF writer is a great alternative (especially because it's free).  Thanks for suggesting that (pinkjimi)!

As far as "going thru huge hypertext files", what I offered to do in the original post was to write a small utility that does it for you.  Now that the topic has exceeded one page, I guess I'll have to man up and write it.  (I expect it to be trivial, though, as all it needs to do is apply one regular expression to the entire file.)

Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 03, 2011, 01:13:20 AM
OK, so here is the software that will take your saved 'Print' version of a topic, and convert the image references back into true image tags.

http://www.aronnelson.com/gallery/main.php?g2_view=core.DownloadItem&g2_itemId=46078&g2_GALLERYSID=56273756c9e85f5ead1f66d4270edfa0

It's a zip file that contains a single executable.

Basic info about it:
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 03, 2011, 01:39:54 AM
OK.  So I already noticed one minor bug.  It doesn't have any impact on how the program works, but here it is:
After a file is converted, the status bar at the bottom of the window tells you the name of the output file.
It says the output file name is the same as the input file with an 'X' appended.
In reality, the 'X' is inserted before the extension (.htm), so you can just double-click the output file to open it in your browser.
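
The naming rule described here (an 'X' inserted before the extension rather than appended after it) can be restated as a short Python sketch. The function name `output_name` is hypothetical and this is not the utility's actual code, just the same rule written out.

```python
import os


def output_name(input_path):
    """Insert 'X' before the extension: 'topic.htm' -> 'topicX.htm'."""
    root, ext = os.path.splitext(input_path)
    return root + "X" + ext


print(output_name("topic.htm"))  # topicX.htm
```

Keeping the extension at the end is what lets the output file open in the browser with a double-click.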

Title: Re: Here's how to save a large topic from the forum
Post by: jbgron on September 03, 2011, 01:48:31 AM
Nice work Walt.
Title: Re: Here's how to save a large topic from the forum
Post by: waltk on September 06, 2011, 05:31:54 PM
Well, I guess most people didn't find the little utility I wrote very useful, 'cause there have only been a couple downloads.  Personally, I like to be able to archive certain threads.  You never know when some valuable schematic or layout image will go missing.  Also, it's nice to be able to search an entire thread.  ..but I guess that's just me.

Anyway, before I let this thread fade into the sunset, I've uploaded one last updated version of the utility.  New features include:


Here's the download link: http://www.aronnelson.com/gallery/main.php?g2_view=core.DownloadItem&g2_itemId=46108&g2_GALLERYSID=56273756c9e85f5ead1f66d4270edfa0
Title: Re: Here's how to save a large topic from the forum
Post by: defaced on September 06, 2011, 10:40:10 PM
Thanks!  This will come in super handy in the near future. 
Title: Re: Here's how to save a large topic from the forum
Post by: jazbo8 on September 10, 2011, 01:55:15 AM
Just tried the program, it worked like a charm, highly recommended!

Jaz
:D