Joe Winograd asked:

Workaround to print code snippets in EE articles

Hi Folks,

When printing an EE article, code snippets are jumbled up...they look like this:

User generated image
Gustav Brock submitted a bug report on this more than two years ago:
Code blocks of article print view are unreadable

The EE Mod at the time, Modalot, said this:
Ouch. That renders the printout almost useless. I've filed this internally as a major bug.
Not being aware of Gustav's bug report, I submitted one about two months ago:
Print feature at articles does not format code snippets properly

In response to my bug report, the same Mod (Modalot) informed me that it is an old bug and is "still unresolved". Being more than 2-1/4 years old and filed as a "major bug" by a Mod who has been an EE member for more than 10 years, it seems to me that EE is either incapable of, or not interested in, fixing the bug.

As a result, I'm looking to the experts here for a workaround. I want to use the workaround to "print-to-PDF" my 63 EE articles and the Steps at my 47 EE video Micro Tutorials. Although that's 110 publications, only some of them have code snippets, so while I would prefer as automated a solution as possible, having to do some manual effort on each one is fine. I'm also fine with a solution that requires other products, such as Microsoft Word, or other commercial software.

Thanks for your ideas! Regards, Joe
gr8gonzo

The first thing that comes to mind is to just throw together a Chrome extension to rewrite the code block content. Assuming it is a DIV today, maybe wrap the content into a PRE tag inside.

A Chrome extension could do that automatically when you load the page, and you could just run it in developer mode to make it a quick on/off thing.
You could also write a web-scraping PHP script to go pull down all your articles, use SimpleHtmlDom to parse the content, and then use TCPDF to generate PDFs from the content. It would be a bit of effort to get the PDF generation / formatting right but it would likely be the fastest automated solution.
Are you trying to print outside of EE, excluding the EE stuff?

Look at printing using html2ps or HTML2pdf

The issue might be with overlays.
Joe Winograd (ASKER)

Hi gr8gonzo,

> The first thing that comes to mind is to just throw together a chrome extension to rewrite the code block content.

I've never coded a Chrome extension...don't have a clue how to do it.

> Assuming it is a DIV today, maybe wrap the content into a PRE tag inside.

All of my development work is on the Windows client side...nothing on the web. I don't know what "DIV" or "PRE tag" is.

> You could also write a web-scraping PHP script to go pull down all your articles, use SimpleHtmlDom to parse the content, and then use TCPDF to generate PDFs from the content.

As mentioned earlier, I'm not a web developer...no knowledge of PHP coding.

Regards, Joe
Do you have an HTML editor (WYSIWYG)?

Which browser are you using to view and print the articles?
The issue potentially deals with spacing, layout.

The point of the suggestions is to review the HTML code on the page to see what separates the content (the question/answer or article content) from the rendering template.
Hi arnold,

> Are you trying to print outside of EE, excluding the EE stuff?

No. I simply want to print what the EE Print icon creates. For example, visit this article:
https://www.experts-exchange.com/articles/33943/Automatically-download-files-from-the-web-AutoHotkey-Script.html

Look at the code block there...it displays perfectly.

At the bottom of the article, you'll see the Menu icon, which is three horizontal dots. Click that and you'll see the Print icon. Click that and it will create a printable page with this URL:
https://www.experts-exchange.com/articles/33943/Automatically-download-files-from-the-web-AutoHotkey-Script.html?printer=true

That's where it gets messed up...look at the code block there.

> Look at printing using html2ps or HTML2pdf

How would I do that for an EE article, such as the one mentioned above?

> The issue might be with overlays.

I don't know what you mean by that, but if it is the problem, how do you fix it so that the code block prints correctly?

Regards, Joe
The issue is that the div containing the snippet is limited in size and made scrollable. I'm not at a system where I can look at the HTML code right now, but in most browsers, pressing F12 will open the developer tools, which let you inspect the elements.

The div containing the code needs to be changed from scrollable to an explicit size.

I forgot to mention:
The <pre> tag marks preformatted content, meaning that the browser displays it as is, whereas ordinary HTML often does not preserve line breaks and spaces.

I'll take a look at the HTML source and will try to identify the points of interest for you.
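
For anyone who wants to try the "scrollable to explicit" idea on a saved copy of the print page, here is a minimal Python sketch. It assumes the garbling comes from overflow/height rules applied to the code block, which nobody in this thread has confirmed. The function names `build_override_css` and `inject_print_css` and the CSS rules themselves are illustrative guesses, not anything EE provides.

```python
import re


def build_override_css(selectors=("pre",)):
    """Return a <style> block that un-clips the given CSS selectors.

    The rules are assumptions about what might be clipping the code,
    not EE's actual stylesheet.
    """
    rule = ", ".join(selectors)
    return (
        "<style>\n"
        f"{rule} {{\n"
        "  overflow: visible !important;     /* no scroll box */\n"
        "  height: auto !important;          /* grow to fit the code */\n"
        "  max-height: none !important;\n"
        "  white-space: pre-wrap !important; /* keep line breaks, wrap long lines */\n"
        "}\n"
        "</style>\n"
    )


def inject_print_css(html, selectors=("pre",)):
    """Insert the override just before </head>, or prepend it if no head exists."""
    css = build_override_css(selectors)
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        return css + html
    return html[:match.start()] + css + html[match.start():]
```

Once the real class name of the enclosing div is known from the developer tools, it could be passed as an extra selector, for example `inject_print_css(saved_html, ("pre", "div.some-container"))`, where `div.some-container` is a placeholder.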
Hi arnold,

> Do you have an HTML editor (WYSIWYG)?

No, but I just Googled "html wysiwyg editor" and the first two (paid) hits are for CKEditor and Froala, both of which are complete garbage. CKEditor is the first product that EE switched to when it stopped using the Comment-block editor for articles. When EE realized how bad the CKEditor is, they switched to Froala (the current one), but it is not better than CKEditor. Anyway, do you have a recommendation for a decent HTML WYSIWYG editor?

> Which browser are you using to view and print the articles?

I've tried many browsers (Chrome, Firefox, Opera, and others), but I think that the browser is irrelevant. The issue seems to be that the EE Print icon creates a garbled code block. Btw, printing the actual article page (rather than the page from the Print icon) doesn't work at all.

> The issue potentially deals with spacing, layout.

If so, is there any way to fix it?

> The point of the suggestions is to review the HTML code on the page to see what separates the content (the question/answer or article content) from the rendering template.

That's probably why EE provides the Print icon.

Regards, Joe
Hi arnold,

> The div containing the code needs to be changed from scrollable to an explicit size.

Interesting! Is that all EE has to do to fix it? If so, I wonder why they haven't done it in the last two years.

> The <pre> tag marks preformatted content, meaning that the browser displays it as is, whereas ordinary HTML often does not preserve line breaks and spaces.

Thanks for explaining that.

> I'll take a look at the HTML source and will try to identify the points of interest for you.

That would be wonderful! Thanks very much, Joe
Hi,

If you click on the Open in new window link under the code box, the snippet will open in a new page:
https://www.experts-exchange.com/viewCodeSnippet.jsp?codeSnippetId=30-33936-1
Then printing with Ctrl+P works OK.

You can use https://www.printfriendly.com/ to generate the PDF from this page.

You can save the code using the Select all link under the code box and copy/paste the content to Notepad, or copy it to JSFiddle to save the snippets.
You said you do Windows client side development. Were you hoping to have a code solution that you could maintain yourself, and if so, what language do you program in?

If not, are you asking someone to build something for you?
Here is the issue: the print option does 90% of the work.
Here is the thing that simplifies things: the print method can be activated by adding ?printer=true to the URL of your article.
The way the code snippet is laid out in the HTML, it is not fixable.
The code is within a <PRE> tag but relies on other parameters for formatting, including cascading style sheets.

Building on the print view described above, while also using lenamtl's suggestion:

print the article
print the code snippets

Then you print to PDF,

then you process the PDF to extract the wrong code and replace it with the correct code.
OCR conversion in a PDF writer of the sections might help.

You indicated you have the content on your own system.
It might be simpler to arrange/format them at your source.

Stripping HTML from the data will whack the formatting even worse than the one in the code snippet.

printer=true will give you a printable version of the page. Note that you have to make sure there is only one ? (QUERY_STRING) indicator in the URL.

i.e. my_article.html?printer=true
The code snippet URL, which already contains a ?, need only have &printer=true added.
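
The "only one ?" rule can be handled mechanically. Below is a minimal Python sketch using the standard `urllib.parse` module; the helper name `with_query_param` is made up for illustration. It appends `printer=true` with `?` when the URL has no query string and with `&` when it already has one. Whether the code snippet page actually honors the parameter is an assumption taken from the post above, not something verified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(url, name, value):
    """Return url with name=value in its query string, using a single '?'."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier copy of `name`
    # so that calling this twice does not duplicate it.
    params = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    params.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


article = ("https://www.experts-exchange.com/articles/33943/"
           "Automatically-download-files-from-the-web-AutoHotkey-Script.html")
snippet = ("https://www.experts-exchange.com/viewCodeSnippet.jsp"
           "?codeSnippetId=30-33943-1")

print(with_query_param(article, "printer", "true"))
print(with_query_param(snippet, "printer", "true"))
```

Parsing and rebuilding the URL, rather than concatenating strings, avoids ending up with two `?` characters when the same list of URLs is processed more than once.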
Hi lenamtl,

> Then printing with Ctrl+P works OK.

Nope. It cuts off when printing. For example, the page looks fine when doing the Open in new window (has horizontal and vertical scroll bars), but when printing it, you get just some lines and some columns, not all...no scroll bars and the rest is cut off. Example:

https://www.experts-exchange.com/viewCodeSnippet.jsp?codeSnippetId=30-33943-1

That looks fine there...has scroll bars...but Ctrl+P produces this:

User generated image
> You can use https://www.printfriendly.com/ to generate the PDF from this page.

I typically use a PDF print driver for that...have many installed...Adobe PDF, Bullzip, CutePDF, doPDF, and more.

> You can save the code using the Select all link under the code box and copy/paste the content to Notepad, or copy it to JSFiddle to save the snippets.

Yes, or I can copy/paste from the source code file on my PC.

Thanks for your ideas. Regards, Joe
We are working on the editor now, so I will get one of my guys to look into making the print version work better.
Potentially, the print view does double duty on the included code.
That is, instead of separating out the included code and processing it separately,
the cascading Style Sheet still applies to the PRE, which potentially deforms it in the printer view.
> Were you hoping to have a code solution that you could maintain yourself, and if so, what language do you program in?

I've done most of my programming in recent years with AutoHotkey. In fact, I wrote a program in it that can download all the EE articles and video pages that are specified in a file list...a plain text file with their URLs. I can make those URLs the printer=true version. Btw, I published an article and two videos on that program:

How to download number of Views, Endorsements, Points for Experts Exchange Articles and Videos
How to download number of Views, Endorsements, Points for Experts Exchange Articles and Videos--Demo
ArticlesVideosEE: Download statistics on Experts Exchange Articles and Videos - Demo of Enhancements

> If not, are you asking someone to build something for you?

As mentioned earlier, I'd be happy to purchase commercial software. I'd also be happy to pay someone to develop a solution. EE used to have Gigs and the Hire Me button, but they are gone, and I think that the approved mechanism is now to do this via PM.

Thanks for all your comments. Regards, Joe
It is not an issue with software, though if you have ABBYY, you might be able to use its command-line tools to get the article, OCR it,
and strip out the mangled code section, while adding the code snippet's printer view to complete the document that will be written out to PDF.
It might rely on client-side JavaScript....
> Here is the thing that simplifies things, the print method can be activated by adding ?printer=true to the URL of your article.

Yes, I posted that yesterday, but it doesn't solve the problem, as noted in subsequent posts.

> The way the code snippet is laid out in the HTML, it is not fixable. The code is within a <PRE> tag but relies on other parameters for formatting, including cascading style sheets.

Very interesting!

> print the article
> print the code snippets
> Then you print to PDF

Yep, thought of that before posting this question, but that's a lot of manual effort.

> then you process the PDF to extract the wrong code and replace it with the correct code.

Won't need to do that if I simply print the code to a PDF, then include that PDF in the overall PDF.

> OCR conversion in a PDF writer of the sections might help.

I don't like to use OCR when everything is already text.

> you indicated you have the content on your own system.
> It might be simpler to arrange/format them at your source.

Yes, I'm leaning towards that.

> stripping HTML from the data will whack the formatting even worse than the one in the code snippet

Good point!

> printer=true will give you a printable version of the page.

Yes, but see earlier comments about it.

> the cascading Style Sheet still applies to the PRE..... which potentially deforms it in the printer view.

Beyond my knowledge, but sounds as if it could be the culprit.

Thanks again, Joe
Hi Jeffrey,

> We are working on the editor now, so I will get one of my guys to look into making the print version work better.

That's great news and I really appreciate it! But it certainly begs the question of why the bug reports from two years ago and two months ago did not get any action.

Btw, to be clear, in terms of your "making the print version work better" comment, the print version works fine as is, imo, except for this one bug...the garbled code snippet. Regards, Joe
Hi Arnold,

> if you have ABBYY

I do have ABBYY FineReader...and Adobe Acrobat DC Pro, Kofax/Nuance OmniPage Ultimate, Kofax/Nuance PaperPort Pro, Kofax/Nuance Power PDF Advanced, Tracker PDF-XChange Editor Pro, and lots more, but I'm still inclined to think that OCR is not the right way to go. Regards, Joe
Okay, given all that, and the fact you already have an automated script to do the download, the simplest approach is probably just to do a simple Fiddler hook to swap out the code block with one that doesn't get re-processed.

Should be a quick thing - give me one moment and I'll put up a file for you.
ASKER CERTIFIED SOLUTION
gr8gonzo

This solution is only available to members.
Also, if you're experienced at all with C#, you can extend my class to inject your own formatting if you want.

In any event, when Fiddler is running, it should act as the default system proxy. So IE and Chrome should both see the updated HTML automatically, along with any other HTTP client that obeys the default system proxy, which I -think- should include your own script that you were using to automate the download/printing.
> you already have an automated script to do the download

Yes, and it will be very easy to modify the script to save the downloads to files with unique names (the options in the posted script are to delete the file or save it with the same file name, i.e., overwrite it).
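
One way to derive unique names from the URLs, sketched here in Python rather than AutoHotkey purely for illustration. The helper `filename_for` is hypothetical and is not part of the posted script. It takes the last path segment of the URL, replaces characters that are awkward in Windows file names, and adds a numeric suffix when a name has already been used.

```python
import re
from urllib.parse import urlsplit


def filename_for(url, used):
    """Return a file name for url that is not already in the set `used`."""
    path = urlsplit(url).path.rstrip("/")
    stem = path.rsplit("/", 1)[-1]
    # Drop a trailing .htm/.html so the extension is not doubled later.
    stem = re.sub(r"\.html?$", "", stem, flags=re.IGNORECASE)
    # Replace anything outside a conservative character set.
    stem = re.sub(r"[^A-Za-z0-9._-]+", "_", stem) or "page"
    candidate = f"{stem}.html"
    counter = 2
    while candidate in used:
        candidate = f"{stem}_{counter}.html"
        counter += 1
    used.add(candidate)
    return candidate


used_names = set()
url = ("https://www.experts-exchange.com/articles/33943/"
       "Automatically-download-files-from-the-web-AutoHotkey-Script.html"
       "?printer=true")
print(filename_for(url, used_names))
print(filename_for(url, used_names))  # same URL again gets a _2 suffix
```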

> just do a simple Fiddler hook to swap out the code block with one that doesn't get re-processed

Sounds promising!

> Should be a quick thing - give me one moment and I'll put up a file for you.

Much appreciated!
Hi gr8gonzo,
Our messages crossed. I'll study your last two posts...thanks very much for all that effort! Regards, Joe
SOLUTION
This solution is only available to members.
Hi gr8gonzo,

If it's not too much trouble, could you try your method on this article and attach the resulting PDF (either Print>Save As PDF in Chrome or print to a PDF print driver from any browser):
https://www.experts-exchange.com/articles/33943/

Thanks much, Joe
Sure. Here you go, using the Chrome "Save as PDF".
Automatically-download-files-from-th.pdf
> Here you go, using the Chrome "Save as PDF".

Very nice! Is it possible to remove the white circle from the lower left of every page:

User generated image
Regards, Joe
SOLUTION
This solution is only available to members.
> I'm happy to help out but we'll probably have to discuss some kind of contract if there are more requests, so this doesn't just become a free work thing. :)

I'm in complete agreement. Just hope that Scott Fell doesn't see that post. I've been taken to the woodshed for similar comments. :)
Hi Jeffrey,

> We are working on the editor now

While you're working on it, could you please fix this bug that I submitted a year-and-a-half ago:

https://www.experts-exchange.com/bugs/23488/Author's-name-missing-when-printing-an-article-Editor's-name-s-too.html

It would really be nice for the author's name to appear in the article when printing. :)

Thanks, Joe
Hi Arnold,

> Try the following, disable JavaScript in your browser and it will display the entire code snippet, but because EE did not place the JavaScript code within HTML comment tags to make them invisible when JavaScript is not available, they are included in the text.

Well, now, isn't that interesting?! Disabling JavaScript breaks lots of other features at the site, but so far it is working very nicely for this purpose...great idea! Thanks, Joe
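
For pages that have already been downloaded by a script, a rough equivalent of disabling JavaScript is to remove the `<script>` elements from the saved HTML before opening or printing it. The Python sketch below does that with a regular expression. It is an assumption that this reproduces what the browser shows with JavaScript disabled, and a regex is not a full HTML parser, so treat it as a starting point only. The name `strip_scripts` is illustrative.

```python
import re

# Matches an opening <script ...> tag, its contents, and the closing tag.
SCRIPT_RE = re.compile(
    r"<script\b[^>]*>.*?</script\s*>",
    flags=re.IGNORECASE | re.DOTALL,
)


def strip_scripts(html):
    """Return html with every <script>...</script> element removed."""
    return SCRIPT_RE.sub("", html)


sample = (
    '<p>intro</p>'
    '<script type="text/javascript">var x = "<b>";</script>'
    '<pre>code stays</pre>'
)
print(strip_scripts(sample))
```

Removing the scripts from the saved file also keeps the script source from appearing in the printed text, which was the side effect noted in the quoted post.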
Thanks to everyone who helped! Much appreciated!

Special note for Jeffrey: Please post back here when your guys fix this so that the normal Print function works correctly. Thanks!
SOLUTION
This solution is only available to members.
The fix to the printer version of the styles went out this past Monday, so you should be able to see the code snippets in the printed version now.
My sincerest thanks to the devs for this...truly appreciated!
Hi Jeffrey,

One more ask...the author's name appears nowhere when printing an article. Here's the 26-Oct-2018 bug report on that one:

Author's name missing when printing an article [Editor's name(s), too]

Btw, Martin fixed today the chat circle that was obliterating the lower left of the first page when printing. Regards, Joe
Ok, I will get a ticket going
> Ok, I will get a ticket going

Thanks! A simple by-line would be terrific.
Hi Jeffrey,

One other thing....when your folks put in the by-line for printed articles, please have them do it for the printed Steps at video Micro Tutorials, too, where the by-line is also missing, for example:

User generated image
Thanks, Joe