From pagaltzis at gmx.de Tue Apr 1 00:45:42 2008
From: pagaltzis at gmx.de (Aristotle Pagaltzis)
Date: Tue, 1 Apr 2008 06:45:42 +0200
Subject: On ampersands in query strings (was: HTML entities in URLs and urlencoding)
In-Reply-To:
References:
Message-ID: <20080401044542.GH3690@klangraum>

* Waylan Limberg [2008-04-01 03:50]:
> As far as I can tell, the "&" breaks the query string.

No, it doesn't, as you found out.

However, on a tangential note: if you write web apps, *please*
make sure that you support the semicolon as a query parameter
separator as well as the ampersand:

http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2

More importantly, please **please** make sure that the URIs your
code generates use semicolons rather than ampersands. Semicolons
need not be escaped in HTML and XML, which makes copy-pasting
users much less likely to produce invalid markup regardless of
the context they're working in.

Even though this W3C recommendation is over a decade old, use of
ampersands in query strings persists. (In fact, PHP not only does
not emit URIs with semicolon-separated query strings, by default
it cannot even parse them! You need to set an unbreak-me config
option to make it recognise the semicolon.)

Regards,
--
Aristotle Pagaltzis //

From david at idiomatrix.com Tue Apr 1 05:48:40 2008
From: david at idiomatrix.com (David Herren)
Date: Tue, 1 Apr 2008 05:48:40 -0400
Subject: HTML entities in URLs and urlencoding
In-Reply-To:
References:
Message-ID: <693B1B09-92E1-4C16-A61F-8CCF0AEEEF67@idiomatrix.com>

The escaped ampersand doesn't break any of the URLs that I use
regularly...in fact when I write straight xhtml I encode them in my
URLs so that my pages still pass the xhtml 1.0 strict.

On Mar 31, 2008, at 9:45 PM, Waylan Limberg wrote:

> Is there something I'm missing or is this a bug? As far as I
> can tell, the "&" breaks the query string.
/david

--
david herren - shoreham, vt us na terra solsys orionarm
"I didn't attend the funeral, but I sent a nice letter saying I
approved of it." --- Mark Twain

From tom at jumpingrock.net Sun Apr 6 00:45:42 2008
From: tom at jumpingrock.net (Tom Humiston)
Date: Sun, 6 Apr 2008 00:45:42 -0400
Subject: buggy HTML from nested lists w/ paragraphs
Message-ID: <5294C44A-7A98-4AAA-B22D-5604654AF23C@jumpingrock.net>

I ran into a weird problem while writing my own little guide to
Markdown (using Markdown, of course). The document is mostly made
of list items in nested ul's. As I neared the end, I found (via
MarsEdit's preview) that in place of two sections of content,
Markdown had begun to generate gibberish strings like this:

aa9ca05a7c006bc5e5c091c00aee0cd7

After weeding through different sections of my text to find an
offending pattern (and even using TextWrangler to search for stray
Unicode and the like), I found that it seems to result from a
particular sequence, or type of sequence, involving headers, list
items, and paragraphs, regardless of their content.

With all non-offending content weeded out, and 'text(n)'
substituted for each line of actual content, I'm left with the text
below, still generating the two errors. Each line of code has a
blank line above it. Removing the blank lines or the final item,
text13, is enough to kill either or both of the errors, while
removal of other items kills one error or the other. (Removal of
text8 kills neither, but I left it in case it's helpful for
diagnosing the problem.)

# text1

  * text2

    * text3

    text4

## text5

  * text6

    * text7

  text8

## text9

  * text10

    * text11

    text12

    text13

The versions of Markdown I'm running this through are version 1.0.1
of the Perl script from daringfireball, and whatever's included in
MarsEdit 2.1.3(1404).

Any ideas?
From bobtfish at bobtfish.net Sun Apr 6 06:40:18 2008 From: bobtfish at bobtfish.net (Tomas Doran) Date: Sun, 6 Apr 2008 11:40:18 +0100 Subject: buggy HTML from nested lists w/ paragraphs In-Reply-To: <5294C44A-7A98-4AAA-B22D-5604654AF23C@jumpingrock.net> References: <5294C44A-7A98-4AAA-B22D-5604654AF23C@jumpingrock.net> Message-ID: On 6 Apr 2008, at 05:45, Tom Humiston wrote: > I ran into a weird problem while writing my own little guide to > Markdown (using Markdown, of course). The document is mostly made > of list items in nested ul's. As I neared the end, I found (via > MarsEdit's preview) that in place of two sections of content, > Markdown had begun to generate gibberish strings like this: > > aa9ca05a7c006bc5e5c091c00aee0cd7 > > After weeding through different sections of my text to find an > offending pattern (and even using TextWrangler to search for stray > Unicode and the like), I found that it seems to result from a > particular sequence, or type of sequence, involving headers, list > items, and paragraphs, regardless of their content. > > With all non-offending content weeded out, and 'text(n)' > substituted for each line of actual content, I'm left with the text > below, still generating the two errors. Each line of code has a > blank line above it. Removing the blank lines or the final item, > text13, is enough to kill either or both of the errors, while > removal of other items kills one error or the other. (Removal of > text8 kills neither, but I left it in case it's helpful for > diagnosing the problem.) > The versions of Markdown I'm running this through are version 1.0.1 > of the Perl script from daringfireball, and whatever's included in > MarsEdit 2.1.3(1404). > > Any ideas? Markdown 1.0.2b8 fixed this bug, as has Text::Markdown. I'd recommend that you upgrade to one of these.. 
Various other implementations also get it right: http://babelmark.bobtfish.net/?markdown=%23+text1%0D%0A%0D%0A++* +text2%0D%0A%0D%0A++++*+text3%0D%0A%0D%0A++++text4%0D%0A%0D%0A%23%23 +text5%0D%0A%0D%0A++*+text6%0D%0A%0D%0A++++*+text7%0D%0A%0D%0A++text8% 0D%0A%0D%0A%23%23+text9%0D%0A%0D%0A++*+text10%0D%0A%0D%0A++++*+text11% 0D%0A%0D%0A++++text12%0D%0A%0D%0A++++text13 Cheers Tom From jgm at berkeley.edu Sun Apr 6 13:50:39 2008 From: jgm at berkeley.edu (John MacFarlane) Date: Sun, 6 Apr 2008 10:50:39 -0700 Subject: markdown PEG (parsing expression grammar) Message-ID: <20080406175038.GA12740@berkeley.edu> There's been a lot of discussion on this list about creating a formal grammar for markdown. I had a go at writing a [parsing expression grammar] for markdown. I used Haskell and John Meacham's Frisby PEG parsing library, but it should not be too hard to port the grammar to PEG libraries in other languages. [parsing expression grammar]: http://en.wikipedia.org/wiki/Parsing_expression_grammar The code is at . All that is required to compile it is the GHC haskell compiler. The grammar itself begins at line 68 of Markdown.hs: . This implementation is not particularly fast, though it is still faster than Markdown.pl. It should be fairly easy to modify and extend, though, and the grammar provides a precise specification of (one interpretation of!) the syntax of markdown, which might be a useful basis for discussion. John From drdrang at gmail.com Sun Apr 6 14:27:10 2008 From: drdrang at gmail.com (Dr. Drang) Date: Sun, 6 Apr 2008 13:27:10 -0500 Subject: markdown PEG (parsing expression grammar) In-Reply-To: <20080406175038.GA12740@berkeley.edu> References: <20080406175038.GA12740@berkeley.edu> Message-ID: <8CB633F0-3CE8-43B0-923F-D912AB13B4DB@gmail.com> Forgive my ignorance, but can this sort of grammar be converted to the form used by lex and yacc? -- Dr. 
Drang www.leancrew.com/all-this On Apr 6, 2008, at 12:50 PM, John MacFarlane wrote: > There's been a lot of discussion on this list about creating a formal > grammar for markdown. I had a go at writing a [parsing expression > grammar] for markdown. I used Haskell and John Meacham's Frisby PEG > parsing library, but it should not be too hard to port the grammar > to PEG libraries in other languages. From jgm at berkeley.edu Sun Apr 6 17:04:32 2008 From: jgm at berkeley.edu (John MacFarlane) Date: Sun, 6 Apr 2008 14:04:32 -0700 Subject: markdown PEG (parsing expression grammar) In-Reply-To: <8CB633F0-3CE8-43B0-923F-D912AB13B4DB@gmail.com> References: <20080406175038.GA12740@berkeley.edu> <8CB633F0-3CE8-43B0-923F-D912AB13B4DB@gmail.com> Message-ID: <20080406210431.GB13502@berkeley.edu> No, not in general. +++ Dr. Drang [Apr 06 08 13:27 ]: > Forgive my ignorance, but can this sort of grammar be converted to the form > used by lex and yacc? > > -- > Dr. Drang > www.leancrew.com/all-this > > On Apr 6, 2008, at 12:50 PM, John MacFarlane wrote: > >> There's been a lot of discussion on this list about creating a formal >> grammar for markdown. I had a go at writing a [parsing expression >> grammar] for markdown. I used Haskell and John Meacham's Frisby PEG >> parsing library, but it should not be too hard to port the grammar >> to PEG libraries in other languages. 
> _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss > From tom at jumpingrock.net Mon Apr 7 09:42:56 2008 From: tom at jumpingrock.net (Tom Humiston) Date: Mon, 7 Apr 2008 09:42:56 -0400 Subject: buggy HTML from nested lists w/ paragraphs In-Reply-To: References: <5294C44A-7A98-4AAA-B22D-5604654AF23C@jumpingrock.net> Message-ID: <98770466-B1DF-469A-931B-E801772B2E5E@jumpingrock.net> My immediate problem has, if not been fixed, at least gone away: While the errors I saw (using Markdown 1.0.1) were in a cheat sheet for my own use, my WordPress blog (the reason I'm learning markdown syntax) is set up with PHP Markdown Extra 1.1.7, which has no trouble with the text that 1.0.1 choked on. On Apr 6, 2008, at 6:40 AM, Tomas Doran wrote: > > On 6 Apr 2008, at 05:45, Tom Humiston wrote: >> I ran into a weird problem while writing my own little guide to >> Markdown (using Markdown, of course). The document is mostly made >> of list items in nested ul's. As I neared the end, I found (via >> MarsEdit's preview) that in place of two sections of content, >> Markdown had begun to generate gibberish strings like this: >> >> aa9ca05a7c006bc5e5c091c00aee0cd7 >> >> After weeding through different sections of my text to find an >> offending pattern (and even using TextWrangler to search for stray >> Unicode and the like), I found that it seems to result from a >> particular sequence, or type of sequence, involving headers, list >> items, and paragraphs, regardless of their content. >> >> With all non-offending content weeded out, and 'text(n)' >> substituted for each line of actual content, I'm left with the >> text below, still generating the two errors. Each line of code has >> a blank line above it. 
Removing the blank lines or the final item, >> text13, is enough to kill either or both of the errors, while >> removal of other items kills one error or the other. (Removal of >> text8 kills neither, but I left it in case it's helpful for >> diagnosing the problem.) > > > >> The versions of Markdown I'm running this through are version >> 1.0.1 of the Perl script from daringfireball, and whatever's >> included in MarsEdit 2.1.3(1404). >> >> Any ideas? > > Markdown 1.0.2b8 fixed this bug, as has Text::Markdown. > > I'd recommend that you upgrade to one of these.. > > Various other implementations also get it right: > > http://babelmark.bobtfish.net/?markdown=%23+text1%0D%0A%0D%0A++* > +text2%0D%0A%0D%0A++++*+text3%0D%0A%0D%0A++++text4%0D%0A%0D%0A%23%23 > +text5%0D%0A%0D%0A++*+text6%0D%0A%0D%0A++++*+text7%0D%0A%0D%0A+ > +text8%0D%0A%0D%0A%23%23+text9%0D%0A%0D%0A++*+text10%0D%0A%0D%0A+++ > +*+text11%0D%0A%0D%0A++++text12%0D%0A%0D%0A++++text13 > > Cheers > Tom > > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss From jgm at berkeley.edu Tue Apr 8 18:37:06 2008 From: jgm at berkeley.edu (John MacFarlane) Date: Tue, 8 Apr 2008 15:37:06 -0700 Subject: markdown PEG (parsing expression grammar) In-Reply-To: <20080406210431.GB13502@berkeley.edu> References: <20080406175038.GA12740@berkeley.edu> <8CB633F0-3CE8-43B0-923F-D912AB13B4DB@gmail.com> <20080406210431.GB13502@berkeley.edu> Message-ID: <20080408223706.GA1804@berkeley.edu> But see http://piumarta.com/software/peg/, a lex/yacc replacement that uses PEG. No doubt there are others as well. +++ John MacFarlane [Apr 06 08 14:04 ]: > No, not in general. > > +++ Dr. Drang [Apr 06 08 13:27 ]: > > Forgive my ignorance, but can this sort of grammar be converted to the form > > used by lex and yacc? > > > > -- > > Dr. 
Drang > > www.leancrew.com/all-this > > > > On Apr 6, 2008, at 12:50 PM, John MacFarlane wrote: > > > >> There's been a lot of discussion on this list about creating a formal > >> grammar for markdown. I had a go at writing a [parsing expression > >> grammar] for markdown. I used Haskell and John Meacham's Frisby PEG > >> parsing library, but it should not be too hard to port the grammar > >> to PEG libraries in other languages. > > _______________________________________________ > > Markdown-Discuss mailing list > > Markdown-Discuss at six.pairlist.net > > http://six.pairlist.net/mailman/listinfo/markdown-discuss > > > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss > From sgbotsford at gmail.com Sat Apr 19 18:28:54 2008 From: sgbotsford at gmail.com (Sherwood Botsford) Date: Sat, 19 Apr 2008 16:28:54 -0600 Subject: Feature Request External label resolution Message-ID: <480A7226.2000904@gmail.com> One of the things I'm coming up against. Maintaining a non-small web site with many internal links is a pain. Consider: Suppose that at one point I have site/ Images Business Home ... Later the site gets more complex, and Images has a bunch of sub directories. site/ Images header_rotate inventory_pix misc Business Home When this happens I have to change the link for every pic on every page. If I use an image in 6 places, I have to change it in 6 places. HOWEVER Suppose I cleverly used the footnote form of links. I.e: [Image alt text][LABEL] Suppose that markdown was clever enough to reference an external file (in .markdownrc of course) for the resolution of LABEL. NOW when I re-arrange the universe, I only have to change the reference in this one file, NOT in every file that references it. 
From public at quillio.com Sat Apr 19 22:15:07 2008
From: public at quillio.com (Lou Quillio)
Date: Sat, 19 Apr 2008 22:15:07 -0400
Subject: Feature Request External label resolution
In-Reply-To: <480A7226.2000904@gmail.com>
References: <480A7226.2000904@gmail.com>
Message-ID:

> Suppose that markdown was clever enough to reference an external
> file (in .markdownrc of course) for the resolution of LABEL.
>
> NOW when I re-arrange the universe, I only have to change the reference in
> this one file, NOT in every file that references it.

Good idea to tokenize URL paths and what not, but it isn't Markdown's
job to transform them for you ;). You'll want to pre-process those
with your own script, then do your Markdown transforms.

LQ

From 29mtuz102 at sneakemail.com Sat Apr 19 23:35:56 2008
From: 29mtuz102 at sneakemail.com (Allan Odgaard)
Date: Sun, 20 Apr 2008 05:35:56 +0200
Subject: Feature Request External label resolution
In-Reply-To: <480A7226.2000904@gmail.com>
References: <480A7226.2000904@gmail.com>
Message-ID: <16913-21083@sneakemail.com>

On 20 Apr 2008, at 00:28, Sherwood Botsford wrote:
> [...]
> Suppose that markdown was clever enough to reference an external
> file (in .markdownrc of course) for the resolution of LABEL.

Here's a simple shell script to convert all markdown to HTML using
a shared references file:

    cd ~/MySite/pages

    for f in *.mdown; do
        cat "$f" references|Markdown.pl > "../html/${f%.mdown}.html"
    done

I use something like that myself where I also have a command to update
my references list, that is, grep through the pages for undefined
references and add these to the references file (where I will then need
to add the URI).
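
The "grep through the pages for undefined references" step described above can be sketched roughly as follows. This is not Allan's actual command, which the thread does not show; it is a minimal Python illustration, and the function names and sample labels are made up for the example. It assumes reference links are written as `[text][label]` or `[label][]` and that definitions look like `[label]: URL` at the start of a line, which covers only the common cases of Markdown's reference syntax.

```python
import re

# A reference-style use: [link text][label], or [label][] for the implicit form.
USE_RE = re.compile(r"\[([^\]]+)\] ?\[([^\]]*)\]")
# A reference definition: "[label]: url", indented by at most three spaces.
DEF_RE = re.compile(r"^ {0,3}\[([^\]]+)\]:\s*\S+", re.MULTILINE)


def used_labels(text):
    """Return the lowercased labels that the text links to."""
    labels = set()
    for link_text, label in USE_RE.findall(text):
        # An empty second bracket means the link text doubles as the label.
        labels.add((label or link_text).lower())
    return labels


def defined_labels(text):
    """Return the lowercased labels that the text defines."""
    return {label.lower() for label in DEF_RE.findall(text)}


def undefined_labels(page_text, references_text):
    """Labels used by the page but defined neither in it nor in the shared file."""
    known = defined_labels(page_text) | defined_labels(references_text)
    return sorted(used_labels(page_text) - known)
```

Run over every page, the result is the list of labels that still need a line in the shared references file; the URI for each has to be filled in by hand, as the message says.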
From sgbotsford at gmail.com Sun Apr 20 02:02:25 2008
From: sgbotsford at gmail.com (Sherwood Botsford)
Date: Sun, 20 Apr 2008 00:02:25 -0600
Subject: Feature Request External label resolution
In-Reply-To:
References: <480A7226.2000904@gmail.com>
Message-ID: <480ADC71.9090603@gmail.com>

Lou Quillio wrote:
>> Suppose that markdown was clever enough to reference an external
>> file (in .markdownrc of course) for the resolution of LABEL.
>>
>> NOW when I re-arrange the universe, I only have to change the reference in
>> this one file, NOT in every file that references it.
>
> Good idea to tokenize URL paths and what not, but it isn't Markdown's
> job to transform them for you ;). You'll want to pre-process those
> with your own script, then do your Markdown transforms.
>
>

Right now if I use the 'footnote' style of markdown link notation
I have to have the resolution of the footnote in the same file.

What I'd like to do is consolidate all those resolution lines in a
single file.

Let's suppose that my tree farm inventory page lists 6 kinds of
spruce available in 4 sizes each. Every time that kind appears, I want
to have a pair of links that go to the spruce overview page, and
the species specific spruce page. Every time the size appears, I
want a link that goes to the price/quantity table for that species.

Now elsewhere in the web site, in the advice page, I talk about the
formal look of Meyer's spruce versus the much more informal look of
Norway spruce. In each case I want Meyer's to show as a link to
the Meyer's spruce page, and Norway to point to the Norway spruce
page.

In my blog I may talk about how cute this year's Meyer's (link
again) spruce are...

The Meyer's spruce page may have 40 links to it from other pages
in my web site. In turn that page may have many links from it to
articles. E.g.:

  Care and feeding of spruce in general.
  A comparison of the different spruces.
  Spruce vs Pine vs Fir.

If I move a link target I have to edit 50 files.

I don't want to edit 50 files for non-visible changes.

I want a system where Markdown sees the link label, and goes to a
single source, no matter what page called it, to resolve that
label into a URL.

Maybe it isn't markdown's place to do this. And sure, I could
write a perl script that would do this, but it seemed to me
that it was something that markdown *could* do. And if it
makes life generally easier for Markdown users, it may be a valuable
enhancement.

To make that perl script robust, I have to maintain yet another
tree of files: the tt2 files, the markdown post process files,
and the final web site files. (If I used the perl script to
replace the token in place in the tt2 files, then I couldn't
change it later.)

If I wanted markdown to ignore the links, then I'd have to
invent a separate marking system, and process it after markdown
finished with it. This saves the extra file tree, but it means
I have to duplicate a significant fraction of the parsing
functions of markdown.

Don't get me wrong. I'm not desperate for this at this time.
But I hear rumours of a new version of markdown in the pipe.
This seems to me to be worth considering.

From sgbotsford at gmail.com Sun Apr 20 02:11:00 2008
From: sgbotsford at gmail.com (Sherwood Botsford)
Date: Sun, 20 Apr 2008 00:11:00 -0600
Subject: Feature Request External label resolution
In-Reply-To: <16913-21083@sneakemail.com>
References: <480A7226.2000904@gmail.com> <16913-21083@sneakemail.com>
Message-ID: <480ADE74.20609@gmail.com>

Allan Odgaard wrote:
> On 20 Apr 2008, at 00:28, Sherwood Botsford wrote:
>
>> [...]
>> Suppose that markdown was clever enough to reference an external
>> file (in .markdownrc of course) for the resolution of LABEL.
>
> Here's a simple shell script to convert all markdown to HTML and using a
> shared references file:
>
>     cd ~/MySite/pages
>
>     for f in *.mdown; do
>         cat "$f" references|Markdown.pl > "../html/${f%.mdown}.html"
>     done
>
> I use something like that myself where I also have a command to update
> my references list, that is, grep through the pages for undefined
> references and add these to the references file (where I will then need
> to add the URI).
>

Hmm. Ok, I think I could do it this way in Template Toolkit:

    [% INCLUDE Header.inc %]
    [% USE Markdown %]
    [% FILTER Markdown %]

    blah blah blah

    [% INSERT References.inc %]
    [% END %]
    [% INCLUDE Footer.inc %]

Downside is then that Markdown has to process every reference,
which if you have a few thousand will be time consuming.
(I have an INSERT text at one point that just in template toolkit
takes 10-12 seconds to do. It's only a thousand lines, but there
is zero processing to do, it just stuffs it into the file.)

From bobtfish at bobtfish.net Sun Apr 20 04:54:01 2008
From: bobtfish at bobtfish.net (Tomas Doran)
Date: Sun, 20 Apr 2008 09:54:01 +0100
Subject: Feature Request External label resolution
In-Reply-To: <480ADE74.20609@gmail.com>
References: <480A7226.2000904@gmail.com> <16913-21083@sneakemail.com> <480ADE74.20609@gmail.com>
Message-ID: <102C5842-04EB-4561-B910-FE00A4A7E871@bobtfish.net>

On 20 Apr 2008, at 07:11, Sherwood Botsford wrote:
> Hmm. Ok, I think I could do it this way in Template Toolkit:

> Downside is then that Markdown has to process every reference
> which if you have a few thousand will be time consuming.
> (I have an INSERT text at one point that just in template toolkit
> takes 10-12 seconds to do. It's only a thousand lines, but there
> is zero processing to do, it just stuffs it into the file.

If you're prepared to have a fiddle with perl, have a look at
Text::Markdown (on CPAN)..

You can do this reasonably trivially - here is a ghetto version of
Markdown.pl which will suck all your link references out of
~/.markdownrc:

    use strict;
    use warnings;
    use Text::Markdown;
    use File::Slurp;
    use File::HomeDir;
    use Path::Class;

    my $m = Text::Markdown->new;
    my $urls;
    {
        # Run ~/.markdownrc through the parser once, purely to collect
        # its link reference definitions.
        my $markdownrc = read_file(file(File::HomeDir->my_home, '.markdownrc'));
        $m->markdown($markdownrc);
        $urls = $m->{_urls};    # internal attribute, no public accessor yet
    }
    # Slurp all of the input into one string first: a bare <> in the
    # argument list is in list context and would pass each input line
    # as a separate argument.
    my $text = do { local $/; <> };
    print $m->markdown($text, {urls => $urls});

That should do what you originally asked for - adjust to taste..
If you wanted to add an accessor for the URLs hashref on the module, and an option to the Markdown.pl script that I bundle in the distribution to give it this functionality, I'd be very happy to take a patch. :_) Cheers Tom From jgm at berkeley.edu Sun Apr 20 10:40:26 2008 From: jgm at berkeley.edu (John MacFarlane) Date: Sun, 20 Apr 2008 07:40:26 -0700 Subject: Feature Request External label resolution In-Reply-To: <480A7226.2000904@gmail.com> References: <480A7226.2000904@gmail.com> Message-ID: <20080420144026.GA6516@berkeley.edu> Pandoc concatenates input from all files specified on the command line. So you can just do: pandoc myfile.txt refs.txt > myfile.html Seems to me that this would be a reasonable default behavior for Markdown.pl as well, but it doesn't seem to work that way now. John +++ Sherwood Botsford [Apr 19 08 16:28 ]: > One of the things I'm coming up against. Maintaining a non-small > web site with many internal links is a pain. > > Consider: > > Suppose that at one point I have > > site/ > Images > Business > Home > ... > > Later the site gets more complex, and Images has a bunch of sub > directories. > site/ > Images > header_rotate > inventory_pix > misc > Business > Home > > When this happens I have to change the link for every pic > on every page. If I use an image in 6 places, I have to > change it in 6 places. > > HOWEVER > > Suppose I cleverly used the footnote form of links. > > I.e: > > [Image alt text][LABEL] > > Suppose that markdown was clever enough to reference an external > file (in .markdownrc of course) for the resolution of LABEL. > > NOW when I re-arrange the universe, I only have to change the reference > in this one file, NOT in every file that references it. 
> > _______________________________________________ > Markdown-Discuss mailing list > Markdown-Discuss at six.pairlist.net > http://six.pairlist.net/mailman/listinfo/markdown-discuss > From fletcher at fletcherpenney.net Sun Apr 20 10:59:42 2008 From: fletcher at fletcherpenney.net (Fletcher T. Penney) Date: Sun, 20 Apr 2008 10:59:42 -0400 Subject: Feature Request External label resolution In-Reply-To: <20080420144026.GA6516@berkeley.edu> References: <480A7226.2000904@gmail.com> <20080420144026.GA6516@berkeley.edu> Message-ID: You could do something like: cat myfile.txt refs.txt | MultiMarkdown.pl > myfile.html F- On Apr 20, 2008, at 10:40 AM, John MacFarlane wrote: > Pandoc concatenates input from all files specified on the command > line. So you can just do: > > pandoc myfile.txt refs.txt > myfile.html > > Seems to me that this would be a reasonable default behavior for > Markdown.pl as well, but it doesn't seem to work that way now. > > John -- Fletcher T. Penney fletcher at fletcherpenney.net If God dropped acid, would he see people? - Steven Wright -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2437 bytes Desc: not available Url : From sgbotsford at gmail.com Sun Apr 20 11:47:26 2008 From: sgbotsford at gmail.com (Sherwood Botsford) Date: Sun, 20 Apr 2008 09:47:26 -0600 Subject: Feature Request External label resolution In-Reply-To: References: <480A7226.2000904@gmail.com> <20080420144026.GA6516@berkeley.edu> Message-ID: <480B658E.3020804@gmail.com> Fletcher T. Penney wrote: > You could do something like: > > cat myfile.txt refs.txt | MultiMarkdown.pl > myfile.html > > > F- > > On Apr 20, 2008, at 10:40 AM, John MacFarlane wrote: >> Pandoc concatenates input from all files specified on the command >> line. 
So you can just do:
>>
>>     pandoc myfile.txt refs.txt > myfile.html
>>
>> Seems to me that this would be a reasonable default behavior for
>> Markdown.pl as well, but it doesn't seem to work that way now.
>>
>> John

I've had several people who have suggested this as a valid
solution.

It makes sense if you have a few dozen links. And if I was
sufficiently clever, I could break the site down so that
references were in a bunch of refs.txt files, and no refs.txt
file would have more than a few dozen links.

However I don't know how to partition my site in such a way.

If refs.txt has 2000 links in it, and every file has to parse
the entire refs file, it takes a long time. As the site grows,
processing time will grow quadratically. (If I double the number
of files, I also double the number of links. So a site that is
twice as big has twice as many pages, each parsing twice as many
links.)

At present I have a small site with 117 pages and 183 links.
Given that, in a year I figure the link count will be over a
thousand.

At present I have one file that has a 1000 line table (including
all the tags on separate lines). When TemplateToolkit / markdown
hit this file there is a 10 - 15 second pause. Since TT is doing
it as an INSERT, not as an INCLUDE, TT isn't even looking at the
file, so I think it's Markdown scanning this file looking for tags
that is causing the delay.

If a thousand line file with no partial matches is slowing down
Markdown this much, I would expect that a file with 997
non-matching label/url lines and 3 matching label/url lines would
cause considerably greater delay.

So perhaps I should ask a more general question:

How do you deal with large numbers of links?
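
One way around the cost of appending the whole refs.txt to every page is a pre-processing step that appends only the definitions a given page actually uses, so that Markdown sees a handful of reference lines instead of thousands. The following is a minimal Python sketch of that idea, not something proposed in the thread; the function names and example paths are hypothetical, the regular expressions cover only simple `[text][label]` and `[label]: URL` forms, and whether it removes the delay described above is untested.

```python
import re

# A reference-style use: [link text][label], or [label][] for the implicit form.
USE_RE = re.compile(r"\[([^\]]+)\] ?\[([^\]]*)\]")
# A reference definition at the start of a line: "[label]: url".
DEF_RE = re.compile(r"^ {0,3}\[([^\]]+)\]:\s*\S+")


def load_references(references_text):
    """Parse the shared file once into a dict: lowercased label -> definition line."""
    table = {}
    for line in references_text.splitlines():
        match = DEF_RE.match(line)
        if match:
            table[match.group(1).lower()] = line.strip()
    return table


def append_needed_references(page_text, table):
    """Return the page followed by only the definitions it refers to."""
    wanted = set()
    for link_text, label in USE_RE.findall(page_text):
        wanted.add((label or link_text).lower())
    # Dictionary lookups: the work per page depends on the page's own links,
    # not on the size of the shared references file.
    needed = [table[label] for label in sorted(wanted) if label in table]
    if not needed:
        return page_text
    return page_text.rstrip("\n") + "\n\n" + "\n".join(needed) + "\n"
```

The shared file is parsed once per build with `load_references`, and each page is passed through `append_needed_references` before it reaches Markdown, for example as an extra filter step ahead of the Markdown filter.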
From mail at milianw.de Sun Apr 20 11:56:01 2008 From: mail at milianw.de (Milian Wolff) Date: Sun, 20 Apr 2008 17:56:01 +0200 Subject: Feature Request External label resolution In-Reply-To: <480B658E.3020804@gmail.com> References: <480A7226.2000904@gmail.com> <480B658E.3020804@gmail.com> Message-ID: <200804201756.05659.mail@milianw.de> Am Sonntag, 20. April 2008 schrieb Sherwood Botsford: > So perhaps I should ask a more general question: You build your whole page with Markdown? No underlaying CMS or similar? Especially if you consider that many links you really should think about installing a CMS. > How do you deal with large numbers of links? I use Drupal and write the contents in Markdown. But Navigation, related Links, Tags etc. are generated by Drupal - no Markdown there. The basic templates don't use Markdown either. -- Milian Wolff http://milianw.de OpenPGP key: CD1D1393 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. Url : From sgbotsford at gmail.com Sun Apr 20 12:47:04 2008 From: sgbotsford at gmail.com (Sherwood Botsford) Date: Sun, 20 Apr 2008 10:47:04 -0600 Subject: Feature Request External label resolution In-Reply-To: <200804201756.05659.mail@milianw.de> References: <480A7226.2000904@gmail.com> <480B658E.3020804@gmail.com> <200804201756.05659.mail@milianw.de> Message-ID: <480B7388.30702@gmail.com> Milian Wolff wrote: > Am Sonntag, 20. April 2008 schrieb Sherwood Botsford: >> So perhaps I should ask a more general question: > > You build your whole page with Markdown? No underlaying CMS or similar? > Especially if you consider that many links you really should think about > installing a CMS. > >> How do you deal with large numbers of links? > > I use Drupal and write the contents in Markdown. But Navigation, related > Links, Tags etc. are generated by Drupal - no Markdown there. 
The basic > templates don't use Markdown either. > > Currently I use template toolkit. Markdown is used as a filter with tt2. The navigation is handled in two stages: 1. A script called file_index runs find on my template directory. It then reads in a file of sequence overrides. (The default is alphabetical order, which is fine for 90% of the files. The overrides allow me to change this, or to mark regexes to be ignored. At the end of this, I have an index file of all pages that are to have menu items that is sorted in the order that menu items should appear. 2. A template include does the actual menu build, whatever parts of the menu show for that page in the order they are in for that page. The menu at any given time shows the current page/folder and it's parents as being open, but all other folders are pruned. Net result: Even though there are a hundred plus pages in the site, the menu never shows more than about 15-20 items, and the present location is always indicated in a different color. So menu is also my breadcrumb trail. (My pet peeve is sites where you chase links in circles.) Index files for each directory are of the form /Foo/Bar/Bar.html That is, Directory Bar has a file Bar.html. This has some advantages for the site growing. E.g. Initially I have Trees/Poplars.html When I get enough time to write the individual pages for the 5 kinds of poplars, this becomes Trees/Poplars/Poplars.html Trees/Poplars/Balsam_Poplar.html Trees/... Rerun file_index, rerun Ttree, and the links work. I really like markdown, because it doesn't get much in my way. I want to keep markdown. Typically for me to create another page requires about 2 minutes plus the time to actually write the page. I may end up hacking markdown and disabling the link resolution mechanism, and create my own link resolution system as another ttree filter. Or hacking markdown to do a db lookup instead of a linear search through the file. 
So Markdown at present only sees the links that are in my content
pages. And so right now there are only a dozen or so. But
as my site grows, the importance of jumping from page to page
via internal references will increase dramatically.

I'll take a look at Drupal.

From pagaltzis at gmx.de Sun Apr 20 12:50:26 2008
From: pagaltzis at gmx.de (Aristotle Pagaltzis)
Date: Sun, 20 Apr 2008 18:50:26 +0200
Subject: Feature Request External label resolution
In-Reply-To: <102C5842-04EB-4561-B910-FE00A4A7E871@bobtfish.net>
References: <480A7226.2000904@gmail.com> <16913-21083@sneakemail.com> <480ADE74.20609@gmail.com> <102C5842-04EB-4561-B910-FE00A4A7E871@bobtfish.net>
Message-ID: <20080420165026.GG25845@klangraum.plasmasturm.org>

* Tomas Doran [2008-04-20 10:55]:
> If you wanted to add an accessor for the URLs hashref on the
> module, and an option to the Markdown.pl script that I bundle
> in the distribution to give it this functionality, I'd be very
> happy to take a patch. :_)

That would have been great to have for me a while ago. I had to
ditch Markdown in favour of WYSIWYG anyway, so it's no longer an
itch for me. But if we were still using it, I'd want such an
accessor: at this point we'd be appending several thousand link
references to every page, which is not exactly cheap.

Regards,
--
Aristotle Pagaltzis //

From mail at milianw.de Sun Apr 20 13:01:54 2008
From: mail at milianw.de (Milian Wolff)
Date: Sun, 20 Apr 2008 19:01:54 +0200
Subject: Feature Request External label resolution
In-Reply-To: <480B7388.30702@gmail.com>
References: <480A7226.2000904@gmail.com> <200804201756.05659.mail@milianw.de> <480B7388.30702@gmail.com>
Message-ID: <200804201901.56598.mail@milianw.de>

On Sunday, 20 April 2008, Sherwood Botsford wrote:
> Milian Wolff wrote:
> > On Sunday, 20 April 2008, Sherwood Botsford wrote:
> >> So perhaps I should ask a more general question:
> >
> > You build your whole page with Markdown? No underlaying CMS or similar?
> > Especially if you consider that many links, you really should think about
> > installing a CMS.
> >
> >> How do you deal with large numbers of links?
> >
> > I use Drupal and write the contents in Markdown. But Navigation, related
> > Links, Tags etc. are generated by Drupal - no Markdown there. The basic
> > templates don't use Markdown either.
>
> Currently I use Template Toolkit. Markdown is used as a filter
> with tt2.
>
> The navigation is handled in two stages:
>
> 1. A script called file_index runs find on my template
> directory. It then reads in a file of sequence overrides.
> (The default is alphabetical order, which is fine for 90%
> of the files. The overrides allow me to change this, or to
> mark regexes to be ignored.)
>
> At the end of this, I have an index file of all pages that
> are to have menu items that is sorted in
> the order that menu items should appear.
>
> 2. A template include does the actual menu build, whatever
> parts of the menu show for that page in the order they are
> in for that page. The menu at any given time shows the current
> page/folder and its parents as being open, but all other
> folders are pruned. Net result: Even though there are a hundred
> plus pages in the site, the menu never shows more than about
> 15-20 items, and the present location is always indicated
> in a different color. So the menu is also my breadcrumb trail.
> (My pet peeve is sites where you chase links in circles.)
>
> Index files for each directory are of the form
>
> /Foo/Bar/Bar.html That is, Directory Bar has a file Bar.html.
>
> This has some advantages for the site growing.
>
> E.g. Initially I have Trees/Poplars.html
>
> When I get enough time to write the individual pages for the
> 5 kinds of poplars, this becomes
>
> Trees/Poplars/Poplars.html
> Trees/Poplars/Balsam_Poplar.html
> Trees/...
>
> Rerun file_index, rerun ttree, and the links work.
>
> I really like markdown, because it doesn't get much in my
> way. I want to keep markdown.
> Typically for me to create another
> page requires about 2 minutes plus the time to actually write the
> page.

This is a very interesting way of creating a website. Drupal and other CMSs use PHP or other programming languages to keep things customizable and dynamic. Of course this requires a web server with support for PHP or similar, most often a database like MySQL, and so forth.

Since you already have a working system with your setup, you should do what takes the least amount of work. And this would be:

> I may end up hacking markdown and disabling the link resolution
> mechanism, and create my own link resolution system as another
> ttree filter. Or hacking markdown to do a db lookup instead of
> a linear search through the file.

Yes, that seems to be a good idea. Though I reckon you could pick one of the Markdown implementations with support for extensions, or one which is object-oriented, so you could simply override the basic link resolution. Should be cleaner than hacking Markdown.pl.

> So Markdown at present only sees the links that are in my content
> pages. And so right now there are only a dozen or so. But
> as my site grows, the importance of jumping from page to page
> via internal references will increase dramatically.
>
> I'll take a look at Drupal.

Its learning curve might be a bit steep, though I think it is worth it. Maybe try it out for your next website.

--
Milian Wolff
http://milianw.de
OpenPGP key: CD1D1393
From sgbotsford at gmail.com Sun Apr 20 13:25:27 2008
From: sgbotsford at gmail.com (Sherwood Botsford)
Date: Sun, 20 Apr 2008 11:25:27 -0600
Subject: Feature Request External label resolution
In-Reply-To: <200804201901.56598.mail@milianw.de>
References: <480A7226.2000904@gmail.com> <200804201756.05659.mail@milianw.de> <480B7388.30702@gmail.com> <200804201901.56598.mail@milianw.de>
Message-ID: <480B7C87.6090304@gmail.com>

Milian Wolff wrote:
> Yes, that seems to be a good idea. Though I reckon you could pick one of the
> Markdown implementations with support for extensions, or one which is
> object-oriented, so you could simply override the basic link resolution.
> Should be cleaner than hacking Markdown.pl.

Umm. Pointers to Markdown implementations with support for extensions?

> Its learning curve might be a bit steep, though I think it is worth it. Maybe
> try it out for your next website.

I took a quick look at drupal.org. Won't happen this spring. While we still have snow on the ground, my busy season for the tree farm starts in two weeks. As soon as the snow is gone I'll be busier than a one-handed piano player, with 4500 seedlings coming in.

So until next winter, I'll live with what I have.

From mail at milianw.de Sun Apr 20 13:29:42 2008
From: mail at milianw.de (Milian Wolff)
Date: Sun, 20 Apr 2008 19:29:42 +0200
Subject: Feature Request External label resolution
In-Reply-To: <480B7C87.6090304@gmail.com>
References: <480A7226.2000904@gmail.com> <200804201901.56598.mail@milianw.de> <480B7C87.6090304@gmail.com>
Message-ID: <200804201929.47260.mail@milianw.de>

On Sunday, 20 April 2008, Sherwood Botsford wrote:
> Milian Wolff wrote:
> > Yes, that seems to be a good idea. Though I reckon you could pick one of
> > the Markdown implementations with support for extensions, or one which is
> > object-oriented, so you could simply override the basic link resolution.
> > Should be cleaner than hacking Markdown.pl.
>
> Umm.
> Pointers to Markdown implementations with support for
> extensions?

I'm only acquainted with PHP Markdown. It would be possible there.

--
Milian Wolff
http://milianw.de
OpenPGP key: CD1D1393

From michel.fortin at michelf.com Sun Apr 20 15:40:24 2008
From: michel.fortin at michelf.com (Michel Fortin)
Date: Sun, 20 Apr 2008 15:40:24 -0400
Subject: Feature Request External label resolution
In-Reply-To: <480B658E.3020804@gmail.com>
References: <480A7226.2000904@gmail.com> <20080420144026.GA6516@berkeley.edu> <480B658E.3020804@gmail.com>
Message-ID: <0EE94AD3-D8B4-468B-8444-F152BC7AA210@michelf.com>

On 2008-04-20 at 11:47, Sherwood Botsford wrote:

> How do you deal with large numbers of links?

If you were using PHP Markdown, you could write a little extension like this:

    class PreLinkedMarkdown_Parser extends Markdown_Parser {
        var $preurls = array();
        var $pretitles = array();

        function PreLinkedMarkdown_Parser($preurls = array(), $pretitles = array()) {
            // Register the extra step before calling the parent constructor,
            // which is expected to sort the gamut by priority; -999 should
            // then make it run ahead of the standard transformations.
            $this->document_gamut["fillPreURLs"] = -999;
            parent::Markdown_Parser();
            $this->preurls = $preurls;
            $this->pretitles = $pretitles;
        }

        // Seed the parser's link tables with the predefined references.
        function fillPreURLs($text) {
            $this->urls = $this->preurls;
            $this->titles = $this->pretitles;
            return $text;
        }
    }

and use your parser like this:

    // read your link file once and populate the arrays this way:
    $preurls = array('link-ref' => 'url');
    $pretitles = array('link-ref' => 'title'); // optional

    $parser = new PreLinkedMarkdown_Parser($preurls, $pretitles);
    $html = $parser->transform($text);

(Note that I haven't tested any of this code.)

If there is enough interest, I could add this feature to the regular PHP Markdown Parser class.
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/

From waylan at gmail.com Sun Apr 20 16:35:39 2008
From: waylan at gmail.com (Waylan Limberg)
Date: Sun, 20 Apr 2008 16:35:39 -0400
Subject: Feature Request External label resolution
In-Reply-To: <480B7C87.6090304@gmail.com>
References: <480A7226.2000904@gmail.com> <200804201756.05659.mail@milianw.de> <480B7388.30702@gmail.com> <200804201901.56598.mail@milianw.de> <480B7C87.6090304@gmail.com>
Message-ID:

On Sun, Apr 20, 2008 at 1:25 PM, Sherwood Botsford wrote:
>
> Umm. Pointers to Markdown implementations with support for extensions?
>

[Python-Markdown][1] has an [extension API][2].

Interestingly my [Abbreviation Extension][3] for Python-Markdown does something similar for abbreviations. Given an external file, it pulls all abbr defs and uses them in processing the source file. It wouldn't be hard at all to adapt it for link defs. What I really like about it is that the external file can be any text document (markdown or otherwise) and the limited parser will only extract the abbr defs (one per line) from it.

[1]: http://www.freewisdom.org/projects/python-markdown/
[2]: http://www.freewisdom.org/projects/python-markdown/Writing_Extensions
[3]: http://achinghead.com/markdown/abbr/

--
----
Waylan Limberg
waylan at gmail.com
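
The approach described in the message above, adapted to link definitions, could look roughly like this. It is a plain-Python sketch that does not use the Python-Markdown extension API or any code from the Abbreviation Extension; the function name `extract_link_defs` and its regular expression are assumptions made for illustration. It scans any text line by line and keeps only the lines shaped like a Markdown link definition, one per line.

```python
import re

# One definition per line:  [label]: url "optional title"
# The URL may be wrapped in <...>; the title may use "..." or (...).
LINK_DEF = re.compile(
    r'^[ ]{0,3}\[([^\]]+)\]:[ \t]*<?(\S+?)>?'
    r'(?:[ \t]+["(](.+?)[")])?[ \t]*$'
)


def extract_link_defs(text):
    """Return (urls, titles) dicts keyed by lowercase label.

    Lines that are not link definitions are ignored, so the input can be
    any text document, Markdown or otherwise.
    """
    urls, titles = {}, {}
    for line in text.splitlines():
        match = LINK_DEF.match(line)
        if match is None:
            continue
        label = match.group(1).lower()
        urls[label] = match.group(2)
        if match.group(3):
            titles[label] = match.group(3)
    return urls, titles


sample = (
    "Random notes, not Markdown.\n"
    '[Balsam]: /Trees/Poplars/Balsam_Poplar.html "Balsam poplar"\n'
    "more prose [not a def] here\n"
    "[trees]: <http://example.com/Trees/>\n"
)
urls, titles = extract_link_defs(sample)
print(urls)
print(titles)
```

Labels are lowercased because Markdown treats link labels as case-insensitive. The two dicts could then be handed to whatever hook the chosen implementation offers for predefined links.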