PHP regular expressions

// Emulate preg_match_all using preg_match
function custom_preg_match_all($pattern, $subject)
{
    $offset = 0;
    $match_count = 0;
    while(preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, $offset))
    {
        // Increment counter
        $match_count++;
    
        // Get byte offset and byte length (assuming single byte encoded)
        $match_start = $matches[0][1];
        $match_length = strlen($matches[0][0]);

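        // (Optional) Transform $matches to the format it is usually set as (without PREG_OFFSET_CAPTURE set)
        // Reset $newmatches on every iteration so groups from an earlier match do not leak into this one
        $newmatches = [];
        foreach($matches as $k => $match) $newmatches[$k] = $match[0];
        $matches = $newmatches;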
    
        // Your code here
        echo "Match number $match_count, at byte offset $match_start, $match_length bytes long: ".$matches[0]."\r\n";
            
        // Update offset to the end of the match; step one byte past an empty match to avoid an infinite loop
        $offset = $match_start + max($match_length, 1);
    }

    return $match_count;
}
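
For comparison, the built-in preg_match_all can report the same per-match offsets and lengths when given the PREG_SET_ORDER and PREG_OFFSET_CAPTURE flags. The following is a minimal sketch, assuming single-byte text; the $pattern and $subject values are made up for illustration and are not part of the function above.

```php
<?php
// Illustrative inputs (assumptions, not taken from the notes above)
$pattern = '/\d+/';
$subject = 'a1 bb22 ccc333';

// PREG_SET_ORDER groups the results per match;
// PREG_OFFSET_CAPTURE turns every entry into [text, byte offset]
$count = preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

foreach ($sets as $i => $set) {
    [$text, $start] = $set[0];
    echo "Match number " . ($i + 1) . ", at byte offset $start, " . strlen($text) . " bytes long: $text\n";
}
```

The offsets are byte offsets, so multi-byte (for example UTF-8) subjects would need extra care in both versions.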

Regex matching exercise:

Source text: <div>2</div> 2<div>1</div>abcd

Pattern 1:

<div>.*</div>

Matches:

<div>2</div> 2<div>1</div>

Pattern 2:

<div>.*?</div>

Matches:

<div>2</div>

<div>1</div>
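
The difference between the greedy pattern 1 and the lazy pattern 2 can be reproduced in PHP. This is a small sketch; the `#` delimiter is chosen here only so that the `/` in `</div>` needs no escaping, and the variable names are illustrative.

```php
<?php
$subject = '<div>2</div> 2<div>1</div>abcd';

// Greedy: .* runs to the last </div> in the text, giving one long match
preg_match_all('#<div>.*</div>#', $subject, $greedy);

// Lazy: .*? stops at the first </div> it can reach, so each element matches separately
preg_match_all('#<div>.*?</div>#', $subject, $lazy);

print_r($greedy[0]);
print_r($lazy[0]);
```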

With pattern 2, if the source text becomes:

<div><div>2</div> 2<div>1</div>abcd

Matches (text, offset, length):

Match 1: <div><div>2</div> 0 17

Match 2: <div>1</div> 19 12

This is clearly wrong: match 1 starts at the outer, unclosed <div> and swallows the inner one.

Fix:

Pattern 3:

<div>((?!</?div>).)*</div>

Matches (text, offset, length):

Match 1: <div>2</div> 5 12

Group 1: 2 10 1

Match 2: <div>1</div> 19 12

Group 1: 1 24 1
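
Pattern 3 can be checked in PHP as well. The sketch below assumes the second source text (the one with the extra leading `<div>`) and prints each match in the same "text offset length" layout; the variable names are illustrative.

```php
<?php
$subject = '<div><div>2</div> 2<div>1</div>abcd';

// (?!</?div>) rejects any position where an opening or closing div tag starts,
// so the body between <div> and </div> cannot contain another div tag
$pattern = '#<div>((?!</?div>).)*</div>#';

preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

foreach ($sets as $i => $set) {
    [$text, $start] = $set[0];
    echo 'Match ' . ($i + 1) . ": $text $start " . strlen($text) . "\n";
}
```

Because the capture group sits inside the `*`, group 1 holds only the last character consumed. If the whole body is wanted, one option is to wrap the repetition instead, as in `((?:(?!</?div>).)*)`.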