Discussion:
Confused first time Kate user
Richard Owlett
2024-07-02 14:40:32 UTC
I have a Debian machine with Kate Version 16.08.3 .

I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html

I have strings of the form "XYZn" where n is one to three digits
representing values from 1 to 299. I wish to replace all occurrences
with "abc".

The documents give essentially no examples.

Help please.
TIA
candycanearter07
2024-07-02 15:00:04 UTC
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
I'd recommend using an online regex generator like
https://regex101.com/.

This regular expression should do what you want:
[[:digit:]]{3}
--
user <candycane> is generated from /dev/urandom
Janis Papanagnou
2024-07-02 15:11:31 UTC
Post by candycanearter07
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
I'd recommend using an online regex generator like
https://regex101.com/.
[[:digit:]]{3}
Doesn't that mean _exactly_ 3 digits? (The OP wanted 1-299, which
may be one up to three digits.) Some regexp parsers allow {,3} for
an up-to range (but that might mean 0-3, thus also not the desired
expression). Or you can explicitly specify the digits range {1,3}.
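
A minimal sketch of the difference between the counted forms, using Python's re module rather than Kate (an assumption made only for illustration; Kate's engine may differ in details). Python's re has no [[:digit:]] class, so [0-9] is used instead, and the sample strings are made up.

```python
import re

# Hypothetical sample strings with one, two and three digits.
samples = ["XYZ7", "XYZ42", "XYZ299"]

exactly_three = re.compile(r"XYZ[0-9]{3}")    # exactly three digits
one_to_three = re.compile(r"XYZ[0-9]{1,3}")   # one, two or three digits
up_to_three = re.compile(r"XYZ[0-9]{,3}")     # in Python: zero to three digits

for text in samples:
    print(text,
          bool(exactly_three.fullmatch(text)),
          bool(one_to_three.fullmatch(text)))

# {,3} also accepts a bare "XYZ" with no digits at all.
print(bool(up_to_three.fullmatch("XYZ")))
```

Only {1,3} accepts all three samples; {3} rejects the one- and two-digit forms.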

Janis
Richard Owlett
2024-07-03 11:17:56 UTC
Post by Janis Papanagnou
Post by candycanearter07
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
I'd recommend using an online regex generator like
https://regex101.com/.
[[:digit:]]{3}
Doesn't that mean _exactly_ 3 digits? (The OP wanted 1-299, which
may be one up to three digits.) Some regexp parsers allow {,3} for
an up-to range (but that might mean 0-3, thus also not the desired
expression). Or you can explicitly specify the digits range {1,3}.
Janis
You correctly interpreted the restrictions I have.
candycanearter07
2024-07-03 13:30:03 UTC
Post by Janis Papanagnou
Post by candycanearter07
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
I'd recommend using an online regex generator like
https://regex101.com/.
[[:digit:]]{3}
Doesn't that mean _exactly_ 3 digits? (The OP wanted 1-299, which
may be one up to three digits.) Some regexp parsers allow {,3} for
an up-to range (but that might mean 0-3, thus also not the desired
expression). Or you can explicitly specify the digits range {1,3}.
Janis
Ah right, sorry. I don't use regex that frequently.
--
user <candycane> is generated from /dev/urandom
Richard Owlett
2024-07-03 11:15:14 UTC
Post by candycanearter07
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
I'd recommend using an online regex generator like
https://regex101.com/.
Unfortunately it seems your browser does not meet the criteria to
properly render and utilize this website. You need a browser with
support for web workers and Web Assembly.
[[:digit:]]{3}
I suspect that would accept a value of "0".
*ERROR* with results I don't wish to contemplate.
Kenny McCormack
2024-07-03 11:46:26 UTC
In article <v63bs4$25m5u$***@dont-email.me>,
Richard Owlett <***@access.net> wrote:
...
Post by Richard Owlett
Post by candycanearter07
[[:digit:]]{3}
I suspect that would accept a value of "0".
*ERROR* with results I don't wish to contemplate.
I suspect that what you want to do actually can't be done (accurately) with
regexps, if we interpret your requirements literally. Most responders so
far have pretty much glossed over your requirements. For example, while
you want to match (and replace) XYZ299, you want to leave XYZ300 alone.

You probably need a programming language (such as AWK) to do this correctly.
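
For illustration, here is a rough sketch of that approach in Python instead of AWK: the regexp only finds "XYZ" followed by digits, and ordinary code checks the numeric range. The function name replace_in_range and the choice to reject leading zeros are assumptions, not something stated in the thread.

```python
import re

def replace_in_range(text, low=1, high=299, replacement="abc"):
    """Replace XYZ<n> only when n is a plain number within low..high."""
    def decide(match):
        digits = match.group(1)
        # Assumption: forms with a leading zero ("0", "017") are left alone.
        if digits.startswith("0") or not low <= int(digits) <= high:
            return match.group(0)   # keep the original text unchanged
        return replacement
    return re.sub(r"XYZ([0-9]+)", decide, text)

print(replace_in_range("XYZ299 XYZ300 XYZ0 XYZ017 XYZ1"))
```

The range limits live in plain arithmetic here, so changing 299 to some other bound does not require rewriting the pattern.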

Note, BTW, that the real problem with regexps is that there are so many
different implementations. Supposedly, there is a standard - actually,
multiple standards - but each implementation is subtly different. For
example, sometimes you need \ before special characters like ( or { or ?
and sometimes you don't (depending on which implementation you are using).
--
"If our country is going broke, let it be from feeding the poor and caring for
the elderly. And not from pampering the rich and fighting wars for them."

--Living Blue in a Red State--
Richard Owlett
2024-07-03 12:45:18 UTC
Post by Kenny McCormack
...
Post by Richard Owlett
Post by candycanearter07
[[:digit:]]{3}
I suspect that would accept a value of "0".
*ERROR* with results I don't wish to contemplate.
I suspect that what you want to do actually can't be done (accurately) with
regexps, if we interpret your requirements literally. Most responders so
far have pretty much glossed over your requirements.
You noticed < *GRIN* >
Unfortunately that's fairly common on USENET.
I'm used to winnowing "wheat" from "chaff".
Post by Kenny McCormack
For example, while
you want to match (and replace) XYZ299, you want to leave XYZ300 alone.
Actually it's more the case that for _my_ application XYZ300 and above
physically cannot exist.
Post by Kenny McCormack
You probably need a programming language (such as AWK) to do this correctly.
No. Further in this thread Janis Papanagnou demonstrated what I needed.
Post by Kenny McCormack
Note, BTW, that the real problem with regexps is that there are so many
different implementations. Supposedly, there is a standard - actually,
multiple standards - but each implementation is subtly different. For
example, sometimes you need \ before special characters like ( or { or ?
and sometimes you don't (depending on which implementation you are using).
Yepp.
That's why my "Subject:" AND first sentence explicitly reference Kate.
Janis Papanagnou
2024-07-02 15:05:40 UTC
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
Disclaimer: I don't know the Kate editor. But I know Regular
Expressions (RE).
Post by Richard Owlett
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
You may do that with simple patterns if you don't have, say,
strings like XYZ300 that shall be disregarded. Then the RE
may simply be XYZ[0-9]+ meaning any string XYZ that is
followed by an arbitrary number of digits. Instead you can
specify digits as optional XYZ[0-9][0-9]?[0-9]? or define
the amount of digits (1-3) explicitly XYZ[0-9]{1,3} which
still allows numbers out of range 1..299 (say, 0, 300) or
undesired syntaxes like 00 or 000. - Not sure it matters in
your case. If it matters, you can define alternatives with
a bar-symbol, e.g., XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])
that you group with parentheses.
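
To compare the four patterns side by side, here is a small sketch using Python's re module (not Kate; the sample strings and the helper name accepted are invented for illustration). It uses fullmatch, so each sample must be matched in its entirety.

```python
import re

# The four patterns described above, labelled for printing.
patterns = {
    "any digits": r"XYZ[0-9]+",
    "optional digits": r"XYZ[0-9][0-9]?[0-9]?",
    "counted digits": r"XYZ[0-9]{1,3}",
    "alternatives": r"XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])",
}

# Hypothetical test strings, including the problem cases 0, 00 and 300.
samples = ["XYZ0", "XYZ00", "XYZ1", "XYZ34", "XYZ299", "XYZ300", "XYZ1234"]

def accepted(pattern):
    """Return the samples that the pattern matches completely."""
    return [s for s in samples if re.fullmatch(pattern, s)]

for name, pattern in patterns.items():
    print(f"{name:16} {accepted(pattern)}")
```

Only the alternatives form rejects XYZ0, XYZ00 and XYZ300 while still accepting XYZ1, XYZ34 and XYZ299.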

Where you put such regular expressions in your Kate editor
is known to you, I suppose?
Post by Richard Owlett
The documents give essentially no examples.
Regular expressions may first appear confusing, but the links
you posted actually have relevant examples.
Post by Richard Owlett
Help please.
TIA
Hope that helps.

Janis
Richard Owlett
2024-07-03 12:22:14 UTC
Post by Janis Papanagnou
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
Disclaimer: I don't know the Kate editor. But I know Regular
Expressions (RE).
Post by Richard Owlett
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
You may do that with simple patterns if you don't have, say,
strings like XYZ300 that shall be disregarded.
What I must avoid is it recognizing XYZ0 as a match.
That would create chaos ;/
Post by Janis Papanagnou
Then the RE
may simply be XYZ[0-9]+ meaning any string XYZ that is
followed by an arbitrary number of digits. Instead you can
specify digits as optional XYZ[0-9][0-9]?[0-9]? or define
the amount of digits (1-3) explicitly XYZ[0-9]{1,3} which
still allows numbers out of range 1..299 (say, 0, 300) or
undesired syntaxes like 00 or 000. - Not sure it matters in
your case. If it matters, you can define alternatives with
a bar-symbol, e.g., XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])
that you group with parentheses.
If Kate accepts XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9]) as proper syntax,
it should do what I was trying to specify.

Having spent decades in QA/QC related tasks, I pay close attention to my
first reference explicitly warning that Kate does not handle regular
expressions exactly the same as some other editors. Hence the first
line of my post ;}
Post by Janis Papanagnou
Where you put such regular expressions in your Kate editor
is known to you, I suppose?
Yes. There are some advantages to a GUI ;}
In fact I just tried it.
Post by Janis Papanagnou
XYZ hello world
XYZ0 hello world
XYZ017 hello world
XYZ1 hello world
XYZ34 hello world
XYZ999 hello world
XYZ hello world
XYZ0 hello world
XYZ017 hello world
abc hello world
abc hello world
abc9 hello world
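
A caution for anyone trying the same pattern elsewhere: the result above (XYZ34 becoming abc, XYZ999 becoming abc9) is what a matcher that prefers the longest alternative produces. A backtracking engine that takes the first alternative that succeeds, such as Python's re module, behaves differently with the identical pattern. The sketch below shows this, plus a guarded variant that is an assumption of mine and not something from the thread: alternatives reordered longest first, with a lookahead refusing a match that is followed by another digit.

```python
import re

# The six test lines from the post above.
lines = [
    "XYZ hello world",
    "XYZ0 hello world",
    "XYZ017 hello world",
    "XYZ1 hello world",
    "XYZ34 hello world",
    "XYZ999 hello world",
]

# Pattern exactly as posted in the thread.
as_posted = r"XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])"
# Hypothetical variant: longest alternative first, no digit may follow.
guarded = r"XYZ([1-2][0-9][0-9]|[1-9][0-9]|[1-9])(?![0-9])"

def substitute(pattern):
    """Apply the replacement to every test line."""
    return [re.sub(pattern, "abc", line) for line in lines]

for before, first, second in zip(lines, substitute(as_posted), substitute(guarded)):
    print(f"{before:20} {first:20} {second}")
```

In Python the posted pattern turns XYZ34 into abc4 and XYZ999 into abc99, because [1-9] alone already succeeds. The guarded variant replaces XYZ1 and XYZ34 cleanly and leaves XYZ999 untouched, which is not the same as the abc9 shown above.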
Post by Richard Owlett
The documents give essentially no examples.
Regular expressions may first appear confusing, but the links
you posted actually have relevant examples.
No problem with regular expressions per se.
As my background was component level analog electronics, I ended up
working for DEC in Power Supply Engineering in mid-70's. My intro to
regular expressions was observing guys in adjacent department having fun
with TECO.
Post by Janis Papanagnou
Post by Richard Owlett
Help please.
TIA
Hope that helps.
IT DID!
Post by Janis Papanagnou
Janis
Richard Owlett
2024-07-03 18:47:04 UTC
Post by Janis Papanagnou
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
Disclaimer: I don't know the Kate editor. But I know Regular
Expressions (RE).
Post by Richard Owlett
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
You may do that with simple patterns if you don't have, say,
strings like XYZ300 that shall be disregarded. Then the RE
may simply be XYZ[0-9]+ meaning any string XYZ that is
followed by an arbitrary number of digits. Instead you can
specify digits as optional XYZ[0-9][0-9]?[0-9]? or define
the amount of digits (1-3) explicitly XYZ[0-9]{1,3} which
still allows numbers out of range 1..299 (say, 0, 300) or
undesired syntaxes like 00 or 000. - Not sure it matters in
your case. If it matters, you can define alternatives with
a bar-symbol, e.g., XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])
that you group with parentheses.
Where you put such regular expressions in your Kate editor
is known to you, I suppose?
Post by Richard Owlett
The documents give essentially no examples.
Regular expressions may first appear confusing, but the links
you posted actually have relevant examples.
Post by Richard Owlett
Help please.
TIA
Hope that helps.
Janis
The GUI version of Kate accepts XYZ([1-9]|[1-9][0-9]|[1-2][0-9][0-9])
with no problem.

I tried out on a real world test.

I'm converting a paragraph formatted KJV Bible [1][2] to a pleasant
reading experience for vision impaired seniors who have minimal computer
experience.

The search string below worked fine {unexpectedly, didn't have to escape
any non-alphanumeric characters}
<span class="verse" id="V([1-9]|[1-9][0-9]|[1-2][0-9][0-9])"

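As a cross-check outside Kate, the same search string can be tried in Python's re module, where the angle brackets, equals signs and double quotes are likewise ordinary characters that need no escaping. The HTML fragments below are invented; the real eBible markup may look different. The helper name verse_ids is also just for illustration.

```python
import re

# Same search string as above; the closing quote is part of the pattern.
verse_open = re.compile(
    r'<span class="verse" id="V([1-9]|[1-9][0-9]|[1-2][0-9][0-9])"'
)

def verse_ids(html):
    """Return the verse numbers found in matching span tags."""
    return verse_open.findall(html)

# Hypothetical sample markup, not taken from the actual files.
sample = ('<span class="verse" id="V1">1</span> In the beginning '
          '<span class="verse" id="V23">23</span> some later verse')

print(verse_ids(sample))
print(verse_ids('<span class="verse" id="V300">300</span>'))
print(verse_ids('<span class="verse" id="V0">0</span>'))
```

The closing double quote in the pattern works as an anchor: the number must be followed directly by the quote, so V300 and V0 are rejected even by an engine that tries the shortest alternative first.
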
Kate had other useful features. I recognized an HTML construct that I
hadn't considered. It didn't show up as a problem in my test browser, but
it could have in another scenario. Kate can also do things in command-line
mode. That should ease the task of processing >1000 text chapters
automatically. Guess I'll spend the rest of the day reading documentation :}

Thank you

This is a resend - original reply did not appear





[1] KJV Cambridge Paragraph Bible <https://ebible.org/engkjvcpb/>
[2] <https://ebible.org/Scriptures/engkjvcpb_html.zip>
Richard Owlett
2024-07-03 10:57:39 UTC
Post by Richard Owlett
I have a Debian machine with Kate Version 16.08.3 .
I wish to do a search & replace using regular expressions.
The "Help" menu has led to
https://docs.kde.org/stable5/en/kate/katepart/regular-expressions.html
and
https://docs.kde.org/stable5/en/kate/katepart/regex-patterns.html
I have strings of the form "XYZn" where n is one to three digits
representing values of from 1 to 299. I wish to replace all occurrences
with "abc".
The documents give essentially no examples.
Help please.
TIA
This Appendix contains a brief but hopefully sufficient and
covering introduction to the world of regular
expressions. It documents regular expressions in the form
available within KatePart, which is not compatible with the regular
expressions of perl, nor with those of for example
grep.
Please note the phrasing "... which is not compatible ...".
This is a personal project. I started searching the web for information
on regular expressions. Circumstances require I use this particular
machine with its current software ] no updates ;[

My formal background is limited to a single introductory programming
course I took as a freshman E.E. student in '61. I learned by doing.

My web reading and a discussion in another forum have made me
aware that there is more than one way to handle regular expressions.

This time I asked on an "editor"-focused group and specified the specific
editor I'm using.
Lawrence D'Oliveiro
2024-07-05 01:05:08 UTC
Post by Richard Owlett
My web reading and a discussion in another forum have made me
aware that there is more than one way to handle regular expressions.
The Perl style seems to have become something of a de-facto standard.
Janis Papanagnou
2024-07-05 02:00:01 UTC
Post by Lawrence D'Oliveiro
Post by Richard Owlett
My web reading and a discussion in another forum have made me
aware that there is more than one way to handle regular expressions.
I'm not quite sure what you mean by "handle" REs. There's tools
with different syntax and that support more or less functions,
even sometimes exceeding the class of a Regular grammar (at a
given cost). - I suppose you meant this?
Post by Lawrence D'Oliveiro
The Perl style seems to have become something of a de-facto standard.
Hardly. First, there's differences on the functional level; Perl
supports with their regexp library functions that are not part
of the Regular Expression grammar class, they exceed that class.
The consequence is that for that subset there's no O(N) (linear)
complexity guaranteed any more.
Second, there's syntactical differences between tools, that are
necessary to handle meta-characters in their specific language
context; in one tool meta-characters need, e.g., to be escaped
where in another context that's not necessary. How can something
be a standard when (standard-)tools do not support that.
Then there's sometimes syntactical convenience shortcuts in use
(here I'm thinking of Perl's escaped classes of common entities
and their negated forms); these are very handy especially where
these expressions get more complex.

Moreover, when speaking about [de facto] "standards"; what would
that mean in the light of existing (real) standards, like POSIX,
that define behavior of tools and the supported RE implementation
(BRE, ERE).
And finally there are shells (like Kornshell) that have had, since the
1988 version, their own syntax (not comparable with the BRE, ERE, or Perl
syntax), an extension of the "wildcard" patterns. Back-references, an
extension that no longer guarantees O(N), were added later. Only later
versions supported the ERE syntax in addition to the original
Ksh-"patterns".

It's known that fans of specific products often use terms like
de-facto standard. Readers should be careful when spotting such
phrases, they are often nothing but marketing talk.

Usually you have requirements and have to make yourself familiar
with what the allowed tool chest supports (including the Regexp
facilities). Granted, getting familiar is harder than following
marketing suggestions. But there's (real) standards (as opposed
to "de-facto" standards), so if you're learning the standards
(RE or otherwise) you may apply these in a broader context.

And if you have the time and the tools that support these "Perl
regexps", yet better, since they make some things appear tidier
and add convenient functional extensions. Note that Perl regexps
also follow (and extend) the basic syntax of the other standard
Regexps mentioned (BRE, ERE), so learning the basics first can
never be wrong.

Janis
Lawrence D'Oliveiro
2024-07-05 03:58:00 UTC
Post by Janis Papanagnou
Post by Lawrence D'Oliveiro
The Perl style seems to have become something of a de-facto standard.
Hardly. First, there's differences on the functional level; Perl
supports with their regexp library functions that are not part of the
Regular Expression grammar class, they exceed that class. The
consequence is that for that subset there's no O(N) (linear) complexity
guaranteed any more.
Precisely. Many users of REs seem to feel it is useful to at least
have the option of such extensions, and they are willing to pay that
price.
Post by Janis Papanagnou
Second, there's syntactical differences between tools, that are
necessary to handle meta-characters in their specific language
context; in one tool meta-characters need, e.g., to be escaped where
in another context that's not necessary. How can something be a
standard when (standard-)tools do not support that.
Surely they do it the way that Perl does. Hence “Perl style”.
Post by Janis Papanagnou
Moreover, when speaking about [de facto] "standards"; what would
that mean in the light of existing (real) standards ...
It means there is a standard library (actually a whole bunch of them)
you can link against to immediately support that style of regular
expression.

<https://packages.debian.org/search?keywords=pcre&searchon=names&suite=stable&section=all>
Janis Papanagnou
2024-07-06 00:53:07 UTC
Post by Lawrence D'Oliveiro
[...] First, there's differences on the functional level; Perl
supports with their regexp library functions that are not part of the
Regular Expression grammar class, they exceed that class. The
consequence is that for that subset there's no O(N) (linear) complexity
guaranteed any more.
Precisely. Many users of REs seem to feel it is useful to at least
have the option of such extensions, and they are willing to pay that
price.
There's two problems; one related to the Perl user, and one related
to the Perl implementor - if they're not aware about the line drawn
between regular and non-regular grammars. - The Perl user should be
informed about the complexities they buy with certain constructs
so that they are really "willing to pay" and not "pay a price that
they just did not expect". The implementor shall be aware that the
mechanism implemented should well differentiate between the two
domains of complexity for the respective functions, otherwise (as
happened in the past!) a common mechanism is implemented that even
for ordinary O(N) Regular Expressions non-linear complexities have
to be paid (unintentionally!) in certain, even simple, cases.

Note that this caveat is not Perl-specific; all pattern matchers
that support such functions, like e.g. back-references, should be
aware of that.
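
To make the back-reference point concrete, here is a small sketch in Python's re module (chosen only for illustration; the function name find_doubled is hypothetical). The \1 must repeat whatever text the group captured, which requires remembering an arbitrary string and is therefore beyond what a plain Regular Expression in the formal sense can describe.

```python
import re

# \1 refers back to whatever the first group matched.
doubled_word = re.compile(r"\b(\w+) \1\b")

def find_doubled(text):
    """Return the first word that appears twice in a row, or None."""
    match = doubled_word.search(text)
    return match.group(1) if match else None

print(find_doubled("it is is a test"))
print(find_doubled("this is fine"))
```

A matcher supporting this has to backtrack over candidate captures, which is where the linear-time guarantee is lost.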

Janis
