So you need to get a substring between two strings? A quick Google search returns about 101,000 results, the first three of which are from Stack Overflow and solve the problem for rather specific use cases. I wanted to solve the problem once and for all in a more general-purpose way.
The “Find Between” Function
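/**
 * Finds a substring between two strings
 *
 * @param string $string The string to be searched
 * @param string $start The start of the desired substring
 * @param string $end The end of the desired substring
 * @param bool $greedy Use last instance of `$end` (default: false)
 * @return string The substring, or an empty string if nothing matches
 */
function find_between(string $string, string $start, string $end, bool $greedy = false) {
    // Escape regex metacharacters and the "/" delimiter in both markers
    $start = preg_quote($start, '/');
    $end = preg_quote($end, '/');

    // Build /(start)(.*?)(end)/ (lazy) or /(start)(.*)(end)/ (greedy)
    $format = '/(%s)(.*';
    if (!$greedy) $format .= '?';
    $format .= ')(%s)/';
    $pattern = sprintf($format, $start, $end);

    // Group 2 holds the text between the markers; fall back to an
    // empty string instead of raising a notice when nothing matches
    if (!preg_match($pattern, $string, $matches)) return '';
    return $matches[2];
}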
If you’re running PHP 5.6 or lower you’ll need to remove the typehints, because scalar type declarations only arrived in PHP 7.0.
Nothing too flashy here: it escapes the start and end strings, constructs a regex pattern and returns whatever comes between.
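To see what that pattern actually looks like, here is a minimal sketch that builds it outside the function and prints it. The helper name `build_pattern()` is hypothetical and exists only for illustration; it mirrors the steps inside `find_between()` rather than being part of it.

```php
<?php
// Hypothetical helper (not part of the post's function): mirrors how
// find_between() assembles its pattern so the result can be inspected.
function build_pattern(string $start, string $end, bool $greedy = false): string {
    // Escape regex metacharacters and the "/" delimiter
    $start = preg_quote($start, '/');
    $end = preg_quote($end, '/');

    $format = '/(%s)(.*';
    if (!$greedy) $format .= '?'; // the "?" makes the middle group lazy
    $format .= ')(%s)/';

    return sprintf($format, $start, $end);
}

echo build_pattern('foo', 'foo'), PHP_EOL;       // /(foo)(.*?)(foo)/
echo build_pattern('foo', 'foo', true), PHP_EOL; // /(foo)(.*)(foo)/

// Markers containing special characters come back escaped
echo build_pattern('<p>', '</p>'), PHP_EOL;
```

Printing the pattern like this is also a quick way to debug a call that returns something unexpected.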
Usage Example
$string = 'fizz foo bar foo foo';
$start = 'foo';
$end = 'foo';
$greedy = false;
var_dump(find_between($string, $start, $end, $greedy));
// string(5) " bar "
By default it’ll find the shortest string possible (i.e. stop at the first instance of the end string) but if you pass `true` to `$greedy` then it’ll keep looking until it finds the last instance of the end string.
For example: `foo bar foo foo` with `foo` as both start and end string, greedy will return `bar foo` and non-greedy will return only `bar`.
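The difference is easiest to see side by side. The sketch below repeats the function so it can be run on its own; the empty-string fallback for a failed match is an assumption made for this sketch.

```php
<?php
// Copy of find_between() from the post. The empty-string fallback when
// nothing matches is an assumption made for this sketch.
function find_between(string $string, string $start, string $end, bool $greedy = false) {
    $start = preg_quote($start, '/');
    $end = preg_quote($end, '/');
    $format = '/(%s)(.*';
    if (!$greedy) $format .= '?';
    $format .= ')(%s)/';
    $pattern = sprintf($format, $start, $end);
    if (!preg_match($pattern, $string, $matches)) return '';
    return $matches[2];
}

$string = 'foo bar foo foo';

// Lazy (default): stops at the first "foo" after the start marker
var_dump(find_between($string, 'foo', 'foo'));
// string(5) " bar "

// Greedy: runs on to the last "foo" in the string
var_dump(find_between($string, 'foo', 'foo', true));
// string(9) " bar foo "
```

Note that the surrounding spaces are part of the result, so wrap the call in `trim()` if you don’t want them.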
A note on WordPress
A common complaint in the WordPress community is the inability to nest shortcodes of the same name, e.g. `[short][short]Hello, World![/short][/short]`, because the regex which parses content for shortcodes is lazy and will stop at the first closing tag.
Assuming `[short]` wraps its contents in a bold tag, you’d get something like `<b>[short]Hello, World!</b>[/short]` as the output because the inner shortcode doesn’t get passed to the handling function in the `$content` parameter as you’d expect it to.
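Here is a rough sketch of that behaviour in plain PHP. The pattern is a deliberately simplified stand-in that I made up for illustration; it is not the regex WordPress actually uses, and the callback only imitates a shortcode handler.

```php
<?php
// Simplified, hypothetical pattern for illustration only. This is NOT
// the regex WordPress uses to parse shortcodes.
$pattern = '/\[short\](.*?)\[\/short\]/';
$content = '[short][short]Hello, World![/short][/short]';

// Stand-in for a shortcode handler that wraps its content in a bold tag
$output = preg_replace_callback($pattern, function (array $m): string {
    return '<b>' . $m[1] . '</b>';
}, $content);

echo $output, PHP_EOL;
// <b>[short]Hello, World!</b>[/short]
```

The lazy `.*?` pairs the outer opening tag with the first closing tag it meets, which is why the inner `[short]` ends up inside the content and the last `[/short]` is left dangling.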
It’s a frustrating behaviour and leads developers to roll their own solutions, such as registering multiple shortcodes (e.g. `[short]`, `[short-inner]`, etc.) or even to register shortcodes dynamically if one is found in the content with a given prefix (e.g. `[short]`, `[short-foo]`, `[short-bar]`, etc.) but that’s not particularly user-friendly, defying the whole point of shortcodes.
This might be fixed in the future (see https://core.trac.wordpress.org/ticket/14481) but for now, you’ll have to roll your own solution too.
Conclusion
While this is by no means a new problem and the solution is hardly revolutionary, it at least provides a super-simple way to get the job done. No more writing regex by hand (and no more `substr`-ing, `explode`-ing and `implode`-ing!), just a clear and simple function to solve a common annoyance.
Hey Rich,
I tried your solution and it wasn’t working for me so I did some debugging and fixed it.
There is an issue in this block:
if ($trim) {
$string = substr($string, strlen($start));
$string = substr($string, 0, -strlen($start) + 1);
}
Second line in this block should be: $string = substr($string, 0, -strlen($end)); as after trimming `start`, you need to trim length of `end` from the right side of string.
But later I found that preg_match also returns the trimmed string, i.e.
Array
(
[0] => quick brown fox jump
[1] => brown fox
)
So I tweaked above function and here is final outcome:
function find_between($string, $start, $end, $trim=true, $greedy=false)
{
    $stringOut = '';
    $pattern = '/'.preg_quote($start).'(.*';
    if (!$greedy) {
        $pattern .= '?';
    }
    $pattern .= ')'.preg_quote($end).'/';
    preg_match($pattern, $string, $matches);
    if (count($matches)>1) {
        if ($trim) {
            $stringOut = $matches[1];
        } else {
            $stringOut = $matches[0];
        }
    }
    return ($stringOut==='')?false:$stringOut;
}
Hope this helps,
Waqar
Hi Waqar,
Thanks but I can’t see a difference in the results. Have a look at https://3v4l.org/FRkMQ and tell me what you think.
Hey Rich,
will your code work on same strings?
I (like) you Rich, Do you (like) me?
Result?: (you rich, Do you)
Hi Waseem, turned out it didn’t handle that very well at all so I re-wrote the function and updated the post. Should be a bit more reliable now!
Doesn’t work. Tried all your examples. Even the one in the comments. It throws a Warning, which doesn’t help to retrieve JSON.
This works fine.
function get_string_between($string, $start, $end){
    $string = ' ' . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return '';
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}
Hi Nick! Which version of PHP are you running? I’ve tested on PHP 7.0, 7.1 and 7.2 but it obviously won’t work on 5.6 or lower because of the typehints. If you remove them from the function it should work as expected :)
You are correct, I am running 5.6! Thanks for letting me know.
Awesome, hope it’s working for you now! I’ve added a note to the post so this is more obvious in future :)
Hi Rich,
Your function is what I need, however when I use the following:
$string = 'url: bvjie2bvij23bevije2vbei2jvbo2.mp3,';
$start = 'url: ';
$end = ',';
here’s the var_dump:
string(30) "bvjie2bvij23bevije2vbei2jvbo2."
Note that it is truncating after the . before mp3 rather than before the , after mp3. I am running php 7.2
Thanks.
Alan
Hi Alan,
That looks like a really annoying little misbehaviour! Unfortunately I can’t reproduce it — have a look at this example: http://sandbox.onlinephpfunctions.com/code/9de5a0f5d26fb5db7b2c2f5853430814c3d07819
Seems to work on PHP 7.x.x although obviously the type hints will cause an error on previous versions.
My first thought was that maybe the . wasn’t being escaped and the pattern was therefore looking for a single character, which it found in a . literal, but it looks like `preg_quote()` is taking care of that.
Let me know if you get it working!
Hi Rich,
I’m getting an error (preg_match(): Unknown modifier 'p').
Can I use ‘/’ to close tags?
http://sandbox.onlinephpfunctions.com/code/d212085aabb8d418f5e46197e0444dc6c53a3c2d
cheers!
Hi Felipe!
I’ve found the problem and updated this post, thanks for commenting because I never would’ve caught this otherwise! :)
The problem was that I hadn’t specified the delimiter when using `preg_quote` so it wasn’t escaping the forward slashes in your string.
Easy mistake, easy fix!
See https://www.php.net/manual/en/function.preg-quote.php
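For anyone hitting the same error, here is a minimal sketch of what the delimiter argument changes. The `<p>` and `</p>` markers are assumed from the "Unknown modifier 'p'" message above rather than copied from the sandbox link.

```php
<?php
// Markers assumed from the error message, not taken from the sandbox link
$start = '<p>';
$end = '</p>';

// Without a delimiter argument the forward slash is left unescaped, so a
// pattern delimited by "/" would end early and read "p" as a modifier
echo preg_quote($end), PHP_EOL;      // \</p\>

// Passing "/" as the delimiter escapes it too
echo preg_quote($end, '/'), PHP_EOL; // \<\/p\>

// With both markers quoted this way the pattern compiles and matches
$pattern = '/(' . preg_quote($start, '/') . ')(.*?)(' . preg_quote($end, '/') . ')/';
preg_match($pattern, '<p>Hello</p>', $matches);
echo $matches[2], PHP_EOL;           // Hello
```

If you’d rather use a different delimiter such as `#` or `~`, pass that same character to `preg_quote()` instead.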