Comments (10)
You're welcome to take it over. I've already implemented `find` and `count`, in case that helps: string_funcs.diff
from zeek.
If nobody is currently working on this, I'm going to write a bunch of string BiFs this week. Here is the list of functions I'll start tackling (inspired by Python string functions):
- str.rfind
- str.rstrip
- str.lstrip
  - I see the notes above about `sub` being able to do `lstrip` and `rstrip`. I think dedicated functions will be clearer and more intuitive, even if they're potentially redundant.
- str.endswith
- str.isnum
- str.isalpha
- str.isalnum
- str.ljust
- str.rjust
- str.swapcase
- str.to_title
- str.zfill
from zeek.
As noted above, the two strip methods are done. If we're going to mimic the Python methods, here's what the signatures would look like:
- `function count%(str: string, sub: string%): count`: Takes a string and a substring to search for, and returns the number of times that substring is seen.
- `function find%(str: string, sub: string, start: count &default=0, end: count &default=0%): count`: Takes a string and a substring to search for, and returns the index of the start of the substring within the string. It can also take an index to start searching from and an index to stop searching at. `start` should always be less than `end`.
- `function rfind%(str: string, sub: string, start: count &default=0, end: count &default=0%): count`: The same as `find`, but searches in reverse. Takes a string and a substring to search for, and returns the index of the start of the substring within the string. It can also take an index to start searching from and an index to stop searching at. `start` should always be greater than `end`.
- `function startswith%(str: string, sub: string%): bool`: Returns whether a string starts with a substring. This is easily implemented with `find`.
- `function endswith%(str: string, sub: string%): bool`: Returns whether a string ends with a substring. This is easily implemented with `rfind`.
- `function isnum%(str: string%): bool`: Returns whether the entire string represents a number.
- `function isalpha%(str: string%): bool`: Returns whether the entire string consists of alphabetic characters.
- `function isalnum%(str: string%): bool`: Returns whether the entire string consists of alphanumeric characters.
- `function ljust%(str: string, width: count, fill: string%): string`: Returns a copy of a string, left-justified within a number of characters defined by `width`. The extra characters are filled in with `fill`. If the string passed for `fill` is more than one character in length, an error is thrown.
- `function rjust%(str: string, width: count, fill: string%): string`: Returns a copy of a string, right-justified within a number of characters defined by `width`. The extra characters are filled in with `fill`. If the string passed for `fill` is more than one character in length, an error is thrown.
- `function swapcase%(str: string%): string`: Returns a copy of the string with the case of every character swapped. For example, the string `aBc` would be returned as `AbC`.
- `function to_title%(str: string%): string`: Returns a copy of the string in titlecase. This means that the first letter of each word in the string will be capitalized. For more info, see https://docs.python.org/2/library/stdtypes.html#str.title
- `function zfill%(str: string, width: count%): string`: Returns a copy of the string filled on the left side with zeroes. This is effectively `rjust(str, width, "0")`.
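For reference, the Python behavior these signatures would mimic can be checked directly. The sketch below is plain Python, not Zeek. The helper names `startswith_via_find` and `endswith_via_rfind` are hypothetical and only illustrate the "easily implemented with `find`/`rfind`" remarks. Under those assumptions, two details seem worth noting: the `rfind`-based check needs a length guard, and Python's `zfill` is sign-aware, so it is not exactly `rjust` with `"0"` for signed input.

```python
def startswith_via_find(s: str, sub: str) -> bool:
    """Hypothetical helper: prefix test built on find()."""
    return s.find(sub) == 0


def endswith_via_rfind(s: str, sub: str) -> bool:
    """Hypothetical helper: suffix test built on rfind()."""
    # Guard: without it, rfind()'s -1 "not found" result would equal
    # len(s) - len(sub) when sub is exactly one character longer than s.
    if len(sub) > len(s):
        return False
    return s.rfind(sub) == len(s) - len(sub)


# Padding: Python requires the fill to be exactly one character.
padded_left = "abc".ljust(6, "*")
padded_right = "abc".rjust(6, "*")

try:
    "abc".ljust(6, "ab")
    fill_error = False
except TypeError:
    fill_error = True

# zfill() matches rjust(width, "0") for unsigned input,
# but it keeps a leading sign in front of the zeroes.
zfill_plain = "42".zfill(5)
zfill_signed = "-42".zfill(5)
rjust_signed = "-42".rjust(5, "0")
```

If the Zeek versions are meant to track Python closely, the signed `zfill` case is probably the one to decide on explicitly.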
Some questions:
- We already have `strstr`, which is effectively the same thing as `find`.
- I could definitely see other versions of `find` and `rfind` that take patterns, but there's a question of whether those versions should return the position or the string that matches. We already have `find_last`, but it takes a pattern and not a string. It returns the matched string and not a count.
from zeek.
If you are copying Python, also see https://www.python.org/dev/peps/pep-0616/
"add two new methods, removeprefix() and removesuffix(), to the APIs of Python's various string objects. These methods would remove a prefix or suffix (respectively) from a string, if present"
The reason is that the current strip/lstrip methods are often misused under the belief that they only remove the literal argument:
print rstrip("banana", "na");
outputs `b`, not `bana`.
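The difference is easy to reproduce in Python itself. In the sketch below, `remove_suffix` is a hypothetical stand-in for the PEP 616 method, written by hand so that it does not depend on the Python version. The `rstrip` call is the standard one.

```python
def remove_suffix(s: str, suffix: str) -> str:
    """Hypothetical stand-in for PEP 616's str.removesuffix()."""
    if suffix and s.endswith(suffix):
        return s[: len(s) - len(suffix)]
    return s


# rstrip() treats its argument as a set of characters, so it keeps
# stripping trailing "n" and "a" characters until it reaches "b".
stripped = "banana".rstrip("na")

# Removing the literal suffix drops exactly one trailing "na".
trimmed = remove_suffix("banana", "na")

print(stripped)  # b
print(trimmed)   # bana
```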
from zeek.
Do these have some sort of module scoping to avoid name collisions? `count` and `find` seem ripe for such. (In fact, I'd think `count` would get confused with the type.)
from zeek.
Do these have some sort of module scoping to avoid name collisions?
The existing string functions don't at the moment, but there's nothing that would stop us from moving them and leaving deprecated versions in the global namespace.
from zeek.
Python's `isnumeric` method simply checks to see if every character in the string is a number. Is that sufficient or should it actually check to see if the string is a double, negative, etc? How about things like other bases? Scientific notation?
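To make the trade-off concrete, here is how Python treats a few of the inputs in question. This is only a sketch of the Python side. `parses_as_number` is a hypothetical helper standing in for the stricter "is it a number" interpretation, and it is not a proposal for how the BiF should be implemented.

```python
def parses_as_number(s: str) -> bool:
    """Hypothetical helper: True if Python can parse s as a float."""
    try:
        float(s)
    except ValueError:
        return False
    return True


samples = ["123", "-1", "1.5", "1e3", "0x1f", ""]

# isnumeric() is a per-character test, so signs, decimal points,
# exponents, and base prefixes all make it return False.
per_char = {s: s.isnumeric() for s in samples}

# float() accepts signs, decimals, and scientific notation,
# but not hexadecimal prefixes or the empty string.
parsed = {s: parses_as_number(s) for s in samples}
```

With these samples, only `"123"` passes the per-character test, while the float-based check also accepts `"-1"`, `"1.5"`, and `"1e3"`.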
from zeek.
I'd recommend making it the same as Python - less cognitive load for devs who code in both languages.
from zeek.
`find` and `rfind` technically work in Python by taking the substring between the indexes and then searching within that substring. This effectively means that if you pass a range smaller than the search string, it'll always return a failure. I currently have it implemented differently, where the start and end positions apply to the start of the match. This means that if the end of the match is past the requested end position, it'll still return success. Does that sound good or should I swap it to the Python version?
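The two behaviors described above can be compared side by side in Python. The function names are hypothetical, and `find_match_start_style` is only one assumed reading of the alternative, in which `end` is treated as an exclusive bound on where the match may begin. The actual Zeek implementation may differ.

```python
def find_python_style(s: str, sub: str, start: int, end: int) -> int:
    """Python semantics: the whole match must lie inside s[start:end]."""
    return s.find(sub, start, end)


def find_match_start_style(s: str, sub: str, start: int, end: int) -> int:
    """Assumed alternative: only the match's first index must be in [start, end)."""
    pos = s.find(sub, start)
    return pos if pos != -1 and pos < end else -1


text = "hello world"

# "world" starts at index 6 and ends at index 11, past the requested end of 8.
python_result = find_python_style(text, "world", 0, 8)
alternative_result = find_match_start_style(text, "world", 0, 8)

print(python_result)       # -1
print(alternative_result)  # 6
```

The two versions agree whenever the match fits entirely inside the range, so the choice only matters for matches that straddle `end`.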
from zeek.
Unless there is a significant benefit from deviating, I think consistency is best.
from zeek.