Ruby String split with regex

#1
This is **Ruby 1.8.7**, but it should behave the same in 1.9.x.

I am trying to split a string, for example:

a = "foo.bar.size.split('.').last"
# trying to split into ["foo", "bar", "size", "split('.')", "last"]

Basically, I want to split it into the commands it represents. I am trying to do this with a Regexp but am not sure how. My idea was:

a.split(/[a-z\(\)](\.)[a-z\(\)]/)

Here I am trying to use the group `(\.)` as the separator, but this does not seem to be a good approach.
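
For reference, here is a small sketch of what that attempt actually returns. It uses only `String#split` and the string from the question; the variable name `parts` is just for illustration. The character classes on either side of the dot are part of the match and get consumed, and because the pattern contains a capture group, `split` also puts each captured dot into the result:

```ruby
a = "foo.bar.size.split('.').last"

# The letters/parens around each dot belong to the match, so they disappear.
# The capture group (\.) makes split return every matched dot as an element.
parts = a.split(/[a-z\(\)](\.)[a-z\(\)]/)
p parts
#=> ["fo", ".", "a", ".", "iz", ".", "plit('.'", ".", "ast"]
```

So the neighbouring characters have to be checked without being consumed, which is what the answers below address.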

#2
I'm afraid that regular expressions won't take you very far. Consider, for example, the following expressions (which are also valid Ruby):

"(foo.bar.size.split( '.' )).last"
"(foo.bar.size.split '.').last"
"(foo.bar.size.split '( . ) . .(). .').last"

The problem is that the list of calls is actually a tree of calls. The easiest solution in sight is probably to use a Ruby parser and transform the parse tree according to your needs. In this example we descend into the call tree one receiver at a time, gathering the calls into a list:

# gem install ruby_parser
# gem install awesome_print
require 'ruby_parser'
require 'ap'

def calls_as_list code
  tree = RubyParser.new.parse(code)

  t = tree
  calls = []

  while t
    # gather arguments if present
    args = nil
    if t[3][0] == :arglist
      args = t[3][1..-1].to_a
    end
    # append all information to our list
    calls << [t[2].to_s, args]
    # descend to next call
    t = t[1]
  end

  calls.reverse
end

p calls_as_list "foo.bar.size.split('.').last"
#=> [["foo", []], ["bar", []], ["size", []], ["split", [[:str, "."]]], ["last", []]]
p calls_as_list "puts 3, 4"
#=> [["puts", [[:lit, 3], [:lit, 4]]]]

And to show the parse tree of any input:

ap RubyParser.new.parse("puts 3, 4")
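
If installing a parser gem is not an option, a possible middle ground is a small hand-written scanner that splits only on dots outside quotes and parentheses. This is a different technique from the ruby_parser approach above, not a replacement for it. `split_calls` is a hypothetical helper written for this sketch, and it assumes plain single- or double-quoted strings without escapes:

```ruby
# Hypothetical helper (not part of any library): split a call chain on dots
# that are not inside parentheses or quoted strings.
# Assumptions: no escaped quotes, no heredocs, no paren-less arguments.
def split_calls(code)
  parts = [String.new]
  depth = 0      # current parenthesis nesting level
  quote = nil    # the open quote character, or nil when outside a string

  code.each_char do |ch|
    if quote
      quote = nil if ch == quote     # closing quote ends the string
    elsif ch == "'" || ch == '"'
      quote = ch                     # opening quote starts a string
    elsif ch == "("
      depth += 1
    elsif ch == ")"
      depth -= 1
    elsif ch == "." && depth.zero?
      parts << String.new            # top-level dot: start a new segment
      next
    end
    parts.last << ch
  end

  parts
end

p split_calls("foo.bar.size.split('.').last")
#=> ["foo", "bar", "size", "split('.')", "last"]
p split_calls("(foo.bar.size.split( '.' )).last")
#=> ["(foo.bar.size.split( '.' ))", "last"]
```

The second call shows the limit of this shortcut: a chain wrapped in parentheses stays one segment, so for arbitrary Ruby input a real parser is still the safer choice.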

#3
a = "foo.bar.size.split('.').last"
p a.split(/(?<!')\.(?!')/)

#=> ["foo", "bar", "size", "split('.')", "last"]

You are looking for lookahead and lookbehind assertions. Note that lookbehind needs the Ruby 1.9 regex engine; Ruby 1.8.7 does not support it.


#4
I don't have a Ruby environment here, so I tried it with Python's re.split():

In : re.split(r"(?<!')\.(?!')", a)
Out: ['foo', 'bar', 'size', "split('.')", 'last']

The regex above has a negative lookbehind **and** a negative lookahead, so a dot with a single quote directly before or after it is not treated as a separator.

Of course, for your example either the lookbehind or the lookahead alone is sufficient; choose whichever fits your requirement.
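
To make that concrete in Ruby, here is a short sketch comparing the three variants. The string `b` is a made-up input that is not from the question, chosen so the quoted dot has a quote on only one side. The lookbehind variants assume Ruby 1.9 or later:

```ruby
a = "foo.bar.size.split('.').last"

both   = a.split(/(?<!')\.(?!')/)  # dot with no quote on either side
behind = a.split(/(?<!')\./)       # dot not preceded by a quote
ahead  = a.split(/\.(?!')/)        # dot not followed by a quote
p both == behind && behind == ahead
#=> true

# Hypothetical input where the quoted dot is followed by a letter:
b = "foo.split('.x').last"
p b.split(/(?<!')\./)
#=> ["foo", "split('.x')", "last"]
p b.split(/\.(?!')/)
#=> ["foo", "split('", "x')", "last"]
```

For the original string all three give the same list; they only diverge once the argument is something other than a lone quoted dot.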

#5
I think this would do it:

a.split(/\.(?=[\w])/)

I don't know how much you know about regex, but the `(?=[\w])` is a lookahead that says "only match the dot if the next character is a word character" (a letter, digit, or underscore). A lookahead doesn't consume the text it matches. It just "looks". So the result is exactly what you're looking for:

> a.split(/\.(?=[\w])/)
=> ["foo", "bar", "size", "split('.')", "last"]
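
One caveat, sketched with two made-up inputs that are not from the question: the lookahead only inspects the character after the dot, so a dot inside an argument list still splits whenever a word character follows it.

```ruby
pattern = /\.(?=[\w])/

# A nested call chain inside the arguments is split as well:
nested = "foo.index(bar.size).to_s".split(pattern)
p nested
#=> ["foo", "index(bar", "size)", "to_s"]

# So is the decimal point of a float literal:
float = "foo.round(1.5).to_s".split(pattern)
p float
#=> ["foo", "round(1", "5)", "to_s"]
```

For strings shaped like the one in the question this does not matter, but it is worth keeping in mind if the arguments can be arbitrary expressions.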



