Popular Threads From Christiansource:
List Statistics
- Total Threads: 420
- Total Posts: 358
Phrases Used to Find This Thread
|
# 1

08-07-2011 03:10 AM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 2

08-07-2011 03:59 AM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
|
# 3

08-07-2011 01:15 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 4

08-07-2011 03:12 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 5

08-07-2011 03:56 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
|
# 6

08-07-2011 05:18 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 7

08-07-2011 05:22 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 8

08-07-2011 07:12 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
|
# 9

08-07-2011 08:01 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
Intensely interesting conversation! If I want to learn something about
simple programing is perl a good language to pursue? If so, can you point
me to a good place or good documentation to get started?
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>>
>> line.
>>
>
> Oops. That's should be easy enough to fix. Find the line that says my $all
> =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> The
> *join* command added the extra newline.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 10

08-07-2011 09:59 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
Intensely interesting conversation! If I want to learn something about
simple programing is perl a good language to pursue? If so, can you point
me to a good place or good documentation to get started?
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>>
>> line.
>>
>
> Oops. That's should be easy enough to fix. Find the line that says my $all
> =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> The
> *join* command added the extra newline.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>> line.
>
> Oops. That's should be easy enough to fix. Find the line that says my $all =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
> *join* command added the extra newline.
That got it. Thank you so much!
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
# 11

08-07-2011 11:52 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
Intensely interesting conversation! If I want to learn something about
simple programing is perl a good language to pursue? If so, can you point
me to a good place or good documentation to get started?
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>>
>> line.
>>
>
> Oops. That's should be easy enough to fix. Find the line that says my $all
> =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> The
> *join* command added the extra newline.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>> line.
>
> Oops. That's should be easy enough to fix. Find the line that says my $all =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
> *join* command added the extra newline.
That got it. Thank you so much!
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
I cannot speak for Perl, but Python has been fairly easy for me to pick up,
and is frequently used for text processing, as well as web development. I
wouldn't be of much help in that direction, other than to say that the folks
on the Python newbie mailing list are generally pretty accessible.
Perhaps someone with knowledge of both can offer some thoughts on this?
Don
On Fri, Jul 8, 2011 at 15:01, <> wrote:
> Intensely interesting conversation! If I want to learn something about
> simple programing is perl a good language to pursue? If so, can you point
> me to a good place or good documentation to get started?
>
> > On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
> >>
> >> Heh. It did indeed put the linebreaks back in, but also left the new
> >> file double-spaced. That is, a new empty line between each previous
> >>
> >> line.
> >>
> >
> > Oops. That's should be easy enough to fix. Find the line that says my
> $all
> > =
> > join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> > The
> > *join* command added the extra newline.
> >
> > --
> > Robert Wohlfarth
> > _______________________________________________
> > ChristianSource FSLUG mailing list
> >
> > http://cs.uninetsolutions.com
>
>
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
--
D.C. Parris, FMP, LEED AP O+M, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
https://www.xing.com/profile/Don_Parris |
http://www.linkedin.com/in/dcparris
GPG Key ID: F5E179BE
|
# 12

09-07-2011 05:05 PM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
Intensely interesting conversation! If I want to learn something about
simple programing is perl a good language to pursue? If so, can you point
me to a good place or good documentation to get started?
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>>
>> line.
>>
>
> Oops. That's should be easy enough to fix. Find the line that says my $all
> =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> The
> *join* command added the extra newline.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>> line.
>
> Oops. That's should be easy enough to fix. Find the line that says my $all =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
> *join* command added the extra newline.
That got it. Thank you so much!
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
I cannot speak for Perl, but Python has been fairly easy for me to pick up,
and is frequently used for text processing, as well as web development. I
wouldn't be of much help in that direction, other than to say that the folks
on the Python newbie mailing list are generally pretty accessible.
Perhaps someone with knowledge of both can offer some thoughts on this?
Don
On Fri, Jul 8, 2011 at 15:01, <> wrote:
> Intensely interesting conversation! If I want to learn something about
> simple programing is perl a good language to pursue? If so, can you point
> me to a good place or good documentation to get started?
>
> > On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
> >>
> >> Heh. It did indeed put the linebreaks back in, but also left the new
> >> file double-spaced. That is, a new empty line between each previous
> >>
> >> line.
> >>
> >
> > Oops. That's should be easy enough to fix. Find the line that says my
> $all
> > =
> > join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> > The
> > *join* command added the extra newline.
> >
> > --
> > Robert Wohlfarth
> > _______________________________________________
> > ChristianSource FSLUG mailing list
> >
> > http://cs.uninetsolutions.com
>
>
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
--
D.C. Parris, FMP, LEED AP O+M, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
https://www.xing.com/profile/Don_Parris |
http://www.linkedin.com/in/dcparris
GPG Key ID: F5E179BE
On Fri, Jul 8, 2011 at 2:01 PM, <> wrote:
> Intensely interesting conversation! If I want to learn something about
> simple programing is perl a good language to pursue? If so, can you point
> me to a good place or good documentation to get started?
Yeah, the geek in me loves this type of discussion too.
Yes, Perl is a good language to start. "Perl5 for Dummies" provides the
basics. And it's a little easier to follow than reading the documentation
cold.
A scripting language is a good first choice to start programming. Perl is a
scripting language. Python too. Scripting languages are more forgiving,
handle some of the thorny details for you, and are quick to get started.
Once you're comfortable with Perl, I would also recommend studying unix
shell scripting. It provides a nice contrast, seeing how the two differ.
--
Robert Wohlfarth
|
# 13

10-07-2011 05:11 AM
|
|
|
I assume sed is not necessarily the proper tool for this task, or
perhaps not the whole task. After re-reading the most recent discussion
on sed, I was hoping to accomplish something a bit more complicated for
a similar context.
On my site, I typically include on each page a header meta line which
identifies who produced the file. I chose the term "formator" because I
don't own the content of all the files, but I do have permission to use
it. At any rate, it prevents spamming while permitting some folks to
contact me. Each of these has a deconstructed email address which is no
longer valid "somebody at such dot com". The problem with using sed is
the lines are broken, so that the input pattern has a newline break in
it:
That's a literal paste. It's not necessary to keep the line break, but I
need to replace the whole address with 'eddie at soulkiln dot org'. I
estimate there are over a hundred files in the archive in several
different directories and subdirectories. Not all of them have this old
address; some have yet another invalid address.
After reading the manpage and some tutorials, I don't quite get how to
persuade sed to read past the newline. The examples I've found are all
far more complex than I need. I may someday learn those things, but
right now is not the best time to take an extended course in sed.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, Jul 7, 2011 at 9:10 PM, Ed Hurst <> wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent discussion
> on sed, I was hoping to accomplish something a bit more complicated for
> a similar context.
>
You could try this little Perl script:
foreach my $file (@ARGV) {
open my $in, "<$file";
my @slurp = <$in>;
close $in;
my $all = join( "\n", @slurp );
$all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
open my $out, ">$file.new";
print $out $all;
close $out;
}
--
Robert Wohlfarth
I did not try Robert's perl script, but I will say that Perl has the
ability to match across lines while sed does not. In Sed, you need
to use hold-spaces and other complex things to get the job done. In
Perl, you put the "m" (or "s", which is similar) at the end of the
pattern replacement string and it will do the search/replace across
newlines. So, in this case, I would say that using Perl is the way
to go.
- Tim Young
On 7/7/2011 9:10 PM, Ed Hurst wrote:
> I assume sed is not necessarily the proper tool for this task, or
> perhaps not the whole task. After re-reading the most recent
> discussion
> on sed, I was hoping to accomplish something a bit more complicated
> for
> a similar context.
>
> On my site, I typically include on each page a header meta line which
> identifies who produced the file. I chose the term "formator"
> because I
> don't own the content of all the files, but I do have permission to
> use
> it. At any rate, it prevents spamming while permitting some folks to
> contact me. Each of these has a deconstructed email address which
> is no
> longer valid "somebody at such dot com". The problem with using sed is
> the lines are broken, so that the input pattern has a newline break in
> it:
>
>
>
> That's a literal paste. It's not necessary to keep the line break,
> but I
> need to replace the whole address with 'eddie at soulkiln dot org'. I
> estimate there are over a hundred files in the archive in several
> different directories and subdirectories. Not all of them have this
> old
> address; some have yet another invalid address.
>
> After reading the manpage and some tutorials, I don't quite get how to
> persuade sed to read past the newline. The examples I've found are all
> far more complex than I need. I may someday learn those things, but
> right now is not the best time to take an extended course in sed.
>
> Ed Hurst
> --------
> Open for Business - http://ofb.biz/
> Kiln of the Soul - http://soulkiln.org/
> blog - http://soulkiln.myopera.com/
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Thu, 7 Jul 2011, Robert Wohlfarth wrote:
> You could try this little Perl script:
>
> foreach my $file (@ARGV) {
> open my $in, "<$file";
> my @slurp = <$in>;
> close $in;
> my $all = join( "\n", @slurp );
> $all =~ s/softedges\s*at\s*softhome\s*dot\s*net/eddie at soulkiln dot org/m;
> open my $out, ">$file.new";
> print $out $all;
> close $out;
> }
Having never messed with Perl, I can scarcely begin to parse this. What
I need now is how to tell this script to check all the HTML files in a
given directory. I saved it as change.pl and learned how to check the
syntax. It's fine, but it does nothing as it stands.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 9:12 AM, Ed Hurst <> wrote:
> Having never messed with Perl, I can scarcely begin to parse this. What
> I need now is how to tell this script to check all the HTML files in a
> given directory. I saved it as change.pl and learned how to check the
> syntax. It's fine, but it does nothing as it stands.
Sorry about that, Ed. Some documentation would be useful...
Okay, to change the HTML files run the script like this: perl change.pl*.html
You will end up with a directory full of files that end with .html.new.
Check a couple of the .html.new files. I never really trust my own scripts
and always want to eyeball the results first. Once you're satisfied that the
*.html.new files look okay, rename them to .html.
How the script works...
> foreach my $file (@ARGV) {
>
Loop through all of the files passed on the command line.
> open my $in, "<$file";
>
Open the file for reading.
> my @slurp = <$in>;
> close $in;
>
Read the entire file into memory, split apart by lines. At this point, it's
just like sed.
> my $all = join( "\n", @slurp );
>
Combine all of the lines into one long string. We need this to check for the
pattern across lines. This is where Perl has the advantage over sed.
> $all =~ s/softedges\s*at\s*softhome\s***dot\s*net/eddie at soulkiln dot
> org/m;
>
Replace the text. The *\s** matches any whitespace - including a newline.
> open my $out, ">$file.new";
> print $out $all;
> close $out;
>
Write the altered contents out to a different file. Overwriting the
originals makes me nervous. I like to eyeball the changes before committing.
> }
>
--
Robert Wohlfarth
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>> my $all = join( "\n", @slurp );
>>
> Combine all of the lines into one long string. We need this to check for the
> pattern across lines. This is where Perl has the advantage over sed.
I really appreciate your patience.
Once the lines are all "slurped" together, are they put back as they
were, or do I have to put all the linebreaks back in manually? For
maintenance purposes, I keep my HTML files at 72 characters.
I'm doing this on a copy of the backed up site on my machine before I
even try it on the hosting machine.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Ed Hurst wrote:
> On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
>
>>> my $all = join( "\n", @slurp );
>>>
>> Combine all of the lines into one long string. We need this to check for
>> the
>> pattern across lines. This is where Perl has the advantage over sed.
>
> I really appreciate your patience.
>
> Once the lines are all "slurped" together, are they put back as they
> were, or do I have to put all the linebreaks back in manually? For
> maintenance purposes, I keep my HTML files at 72 characters.
Heh. It did indeed put the linebreaks back in, but also left the new
file double-spaced. That is, a new empty line between each previous
line.
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>
> Heh. It did indeed put the linebreaks back in, but also left the new
> file double-spaced. That is, a new empty line between each previous
>
> line.
>
Oops. That's should be easy enough to fix. Find the line that says my $all =
join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
*join* command added the extra newline.
--
Robert Wohlfarth
Intensely interesting conversation! If I want to learn something about
simple programing is perl a good language to pursue? If so, can you point
me to a good place or good documentation to get started?
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>>
>> line.
>>
>
> Oops. That's should be easy enough to fix. Find the line that says my $all
> =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> The
> *join* command added the extra newline.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
On Fri, 8 Jul 2011, Robert Wohlfarth wrote:
> On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
>>
>> Heh. It did indeed put the linebreaks back in, but also left the new
>> file double-spaced. That is, a new empty line between each previous
>> line.
>
> Oops. That's should be easy enough to fix. Find the line that says my $all =
> join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );. The
> *join* command added the extra newline.
That got it. Thank you so much!
Ed Hurst
--------
Open for Business - http://ofb.biz/
Kiln of the Soul - http://soulkiln.org/
blog - http://soulkiln.myopera.com/
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
I cannot speak for Perl, but Python has been fairly easy for me to pick up,
and is frequently used for text processing, as well as web development. I
wouldn't be of much help in that direction, other than to say that the folks
on the Python newbie mailing list are generally pretty accessible.
Perhaps someone with knowledge of both can offer some thoughts on this?
Don
On Fri, Jul 8, 2011 at 15:01, <> wrote:
> Intensely interesting conversation! If I want to learn something about
> simple programing is perl a good language to pursue? If so, can you point
> me to a good place or good documentation to get started?
>
> > On Fri, Jul 8, 2011 at 11:22 AM, Ed Hurst <> wrote:
> >>
> >> Heh. It did indeed put the linebreaks back in, but also left the new
> >> file double-spaced. That is, a new empty line between each previous
> >>
> >> line.
> >>
> >
> > Oops. That's should be easy enough to fix. Find the line that says my
> $all
> > =
> > join( "\n", @slurp );. Change it to read my $all = join( "", @slurp );.
> > The
> > *join* command added the extra newline.
> >
> > --
> > Robert Wohlfarth
> > _______________________________________________
> > ChristianSource FSLUG mailing list
> >
> > http://cs.uninetsolutions.com
>
>
>
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
>
--
D.C. Parris, FMP, LEED AP O+M, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
https://www.xing.com/profile/Don_Parris |
http://www.linkedin.com/in/dcparris
GPG Key ID: F5E179BE
On Fri, Jul 8, 2011 at 2:01 PM, <> wrote:
> Intensely interesting conversation! If I want to learn something about
> simple programing is perl a good language to pursue? If so, can you point
> me to a good place or good documentation to get started?
Yeah, the geek in me loves this type of discussion too.
Yes, Perl is a good language to start. "Perl5 for Dummies" provides the
basics. And it's a little easier to follow than reading the documentation
cold.
A scripting language is a good first choice to start programming. Perl is a
scripting language. Python too. Scripting languages are more forgiving,
handle some of the thorny details for you, and are quick to get started.
Once you're comfortable with Perl, I would also recommend studying unix
shell scripting. It provides a nice contrast, seeing how the two differ.
--
Robert Wohlfarth
Thanks, I don't know when I'll have time to get started on it, but it sure
looks interesting.
> On Fri, Jul 8, 2011 at 2:01 PM, <> wrote:
>
>> Intensely interesting conversation! If I want to learn something about
>> simple programing is perl a good language to pursue? If so, can you
>> point
>> me to a good place or good documentation to get started?
>
>
> Yeah, the geek in me loves this type of discussion too.
>
> Yes, Perl is a good language to start. "Perl5 for Dummies" provides the
> basics. And it's a little easier to follow than reading the documentation
> cold.
>
> A scripting language is a good first choice to start programming. Perl is
> a
> scripting language. Python too. Scripting languages are more forgiving,
> handle some of the thorny details for you, and are quick to get started.
> Once you're comfortable with Perl, I would also recommend studying unix
> shell scripting. It provides a nice contrast, seeing how the two differ.
>
> --
> Robert Wohlfarth
> _______________________________________________
> ChristianSource FSLUG mailing list
>
> http://cs.uninetsolutions.com
_______________________________________________
ChristianSource FSLUG mailing list
http://cs.uninetsolutions.com
)
|
NewsArc Lists
| Culture Pages
| Computing Archive
| Media-Pages
Link to this page on your blog or website by copying the HTML code below and pasting it into your site:
|
|