JsonPath and Regex Extractors in Python

2024-03-30 07:03 • python • 2650 views

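I. Introduction

When we chain API calls in interface tests, we usually link them by saving intermediate variables, and that requires extracting values from a response. This post introduces the two main tools for variable extraction in API automation: the regex extractor and the JsonPath extractor.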

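1.1 Regex extractor

Regex extraction (regular expressions can only extract from string data)

1. re.search: matches only the first occurrence and returns a Match object; read the captured group with index [1] (or .group(1)). Returns None when nothing matches.

2. re.findall: matches every occurrence and returns a list; read individual values by index. Returns an empty list when nothing matches.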
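The no-match behavior of the two functions can be checked with a minimal sketch. The sample string `text` and the key names in it are made up for illustration; note that `re.findall` gives back an empty list, not None, when nothing matches.

```python
import re

# Hypothetical response body, used only for illustration.
text = '{"code":0,"token":"abc123","user":"tom"}'

# re.search returns a Match object for the first hit, or None when nothing matches.
hit = re.search('"token":"(.*?)"', text)
miss = re.search('"session":"(.*?)"', text)
token = hit[1] if hit else None   # hit[1] is the same as hit.group(1)

# re.findall returns a list; with no match it is an empty list.
found = re.findall('"(?:token|user)":"(.*?)"', text)
nothing = re.findall('"session":"(.*?)"', text)

print(token)
print(miss)
print(found)
print(nothing)
```

Because `re.search` can return None, guarding the result before indexing avoids a TypeError in the test script.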

1.2 Regex example:

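import re
import requests

# Fetch the Baidu home page; the regex below runs against the response body.
a = requests.get("http://www.baidu.com")
# print(a.text)

# Capture the charset value that sits right before the X-UA-Compatible meta tag.
b = re.search('charset=(.*?)><meta http-equiv=X-UA-Compatible content=IE=Edge>', a.text)
print(b)           # the Match object (None if the page does not contain this markup)
print(b.group())   # the whole matched text
print(b.groups())  # tuple of all captured groups
print(b.group(1))  # the first captured group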

Result:

<re.Match object; span=(94, 157), match='charset=utf-8><meta http-equiv=X-UA-Compatible co'>
charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge>
('utf-8',)
utf-8
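The same calls can be tried without network access. The sketch below assumes a fixed HTML fragment (`html` is a made-up sample, not the real Baidu page) and a shortened pattern, and shows what `group()`, `groups()` and `group(1)` each return.

```python
import re

# Fixed sample fragment (assumed), so the example runs offline.
html = ('<meta http-equiv=content-type content=text/html;charset=utf-8>'
        '<meta http-equiv=X-UA-Compatible content=IE=Edge>')

m = re.search('charset=(.*?)><meta', html)

whole = m.group()     # entire matched text, same as m.group(0)
groups = m.groups()   # tuple of all captured groups
first = m.group(1)    # first captured group only

print(whole)
print(groups)
print(first)
```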

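Wildcard patterns:

We usually use (.*?) and (.+?) to capture the value we want to extract.

Explanation:

. matches any single character (except a newline by default)
+ matches the preceding expression one or more times
* matches the preceding expression zero or more times
? matches the preceding expression zero or one time; placed after + or * it makes the match non-greedy (as short as possible)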
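A small sketch of how these symbols behave in practice, using made-up sample strings. It compares greedy and non-greedy matching, shows the difference between `.*?` and `.+?` on an empty value, and uses `?` on its own as "optional".

```python
import re

s = '<b>one</b><b>two</b>'

# Greedy: .* grabs as much as possible, so the two tags merge into one match.
greedy = re.findall('<b>(.*)</b>', s)

# Non-greedy: .*? stops at the first closing tag, giving one value per tag.
lazy = re.findall('<b>(.*?)</b>', s)

# .*? accepts an empty value, .+? requires at least one character.
empty_ok = re.findall('<b>(.*?)</b>', '<b></b>')
empty_no = re.findall('<b>(.+?)</b>', '<b></b>')

# ? alone makes the preceding character optional (zero or one time).
optional = re.findall('colou?r', 'color colour')

print(greedy)
print(lazy)
print(empty_ok, empty_no)
print(optional)
```

For extracting a single field from a response, the non-greedy form is usually the safer choice, since the greedy form can swallow everything up to the last delimiter.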

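# res is assumed to be a requests.Response whose body contains "token":"...",
token = re.search('"token":"(.*?)",', res.text)[1]
print("token1:%s" % token)

# findall returns a list of every captured token value
token = re.findall('"token":"(.*?)",', res.text)
print("token2:%s" % token)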
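To show how an extracted value feeds the next request, here is a minimal sketch. The helper `extract_var`, the sample body `login_body`, and the `Authorization: Bearer` header format are all assumptions for illustration; a real test would pass `res.text` and use whatever header the API expects.

```python
import re

def extract_var(pattern, text, index=0, default=None):
    """Return the index-th captured value for pattern in text, or default."""
    values = re.findall(pattern, text)
    return values[index] if len(values) > index else default

# Hypothetical login response body; a real test would use res.text instead.
login_body = '{"code":200,"token":"eyJhbGciOi","expire":3600}'

saved = {}
saved["token"] = extract_var('"token":"(.*?)",', login_body)
saved["refresh"] = extract_var('"refresh":"(.*?)",', login_body, default="")

# The saved value can then be passed to the next request, e.g. as a header.
headers = {"Authorization": "Bearer %s" % saved["token"]}

print(saved)
print(headers)
```

Returning a default instead of raising keeps a missing field from crashing the whole run; whether that is desirable depends on how strict the test should be.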

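1.3 JsonPath extractor

JsonPath extraction (JsonPath can only extract from JSON-format data)

jsonpath.jsonpath returns a list whose items are read by index; when nothing matches it returns False, not None.

JsonPath syntax

Symbol	Description
$	The root node of the query; it represents the whole JSON document, which may be an array or an object
@	The current node being processed inside a filter expression
*	Wildcard: selects all nodes
.	Selects a child node
..	Recursive descent: selects all matching nodes at any depth
?()	Filter expression, used to select nodes by condition
[a] or [a,b]	Array subscript: selects one or more array indices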
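The recursive descent operator is the least obvious entry in the table. The sketch below does not use the jsonpath library; it is a plain-Python stand-in (the function `find_all` and the shortened document are hypothetical) that illustrates the idea behind an expression such as `$..price`: visit every level of the structure and collect each value stored under the given key.

```python
import json

def find_all(node, key):
    """Collect every value stored under key anywhere in node, depth-first."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(find_all(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_all(item, key))
    return found

# Shortened sample document, assumed for illustration.
doc = json.loads(
    '{"store": {"book": [{"price": 8.95}, {"price": 22.99}],'
    ' "bicycle": {"color": "red", "price": 19.95}}}'
)

prices = find_all(doc, "price")
print(prices)
```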

1.4 Using the JsonPath extractor

The following JSON document is used to demonstrate JSONPath in practice:

{
  "store": {
    "book":[
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

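1. Assume the variable bookJson already holds this JSON string; the following code deserializes it into a Python object, named data in the examples that follow:

import json
import jsonpath

data = json.loads(bookJson)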

2. Read the color property of bicycle under store

checkurl = "$.store.bicycle.color"
print(jsonpath.jsonpath(data, checkurl))
# Output: ['red']

3. Print all objects contained in the book node

checkurl = "$.store.book[*]"
object_list = jsonpath.jsonpath(data, checkurl)
print(object_list)

# Output:
[{'category': 'reference', 'author': 'Nigel Rees', 'title': 'Sayings of the Century', 'price': 8.95},
{'category': 'fiction', 'author': 'J. R. R. Tolkien', 'title': 'The Lord of the Rings', 'isbn': '0-395-19395-8', 'price': 22.99}]

4. Print the first object in the book node

checkurl = "$.store.book[0]"
obj = jsonpath.jsonpath(data, checkurl)
print(obj)
# Output: [{'category': 'reference', 'author': 'Nigel Rees', 'title': 'Sayings of the Century', 'price': 8.95}]

5. Print the title property of every object in the book node

checkurl = "$.store.book[*].title"
titles = jsonpath.jsonpath(data, checkurl)
print(titles)
# Output: ['Sayings of the Century', 'The Lord of the Rings']

6. Print all objects in the book node whose category is fiction

checkurl = "$.store.book[?(@.category=='fiction')]"
books = jsonpath.jsonpath(data, checkurl)
print(books)
# Output:
[{'category': 'fiction', 'author': 'J. R. R. Tolkien', 'title': 'The Lord of the Rings', 'isbn': '0-395-19395-8', 'price': 22.99}]

7. Print all objects in the book node whose price is less than 10

checkurl = "$.store.book[?(@.price<10)]"
books = jsonpath.jsonpath(data, checkurl)
print(books)
# Output: [{'category': 'reference', 'author': 'Nigel Rees', 'title': 'Sayings of the Century', 'price': 8.95}]

8. Print all objects in the book node that have an isbn property

checkurl = "$.store.book[?(@.isbn)]"
books = jsonpath.jsonpath(data, checkurl)
print(books)
# Output: [{'category': 'fiction', 'author': 'J. R. R. Tolkien', 'title': 'The Lord of the Rings', 'isbn': '0-395-19395-8', 'price': 22.99}]
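As a cross-check on what these expressions select, the same results can be reproduced with ordinary dictionary access and list comprehensions. This is a plain-Python sketch using only the standard library, not the jsonpath API; the variable names are made up, and the JsonPath expression each line mirrors is given in the comment.

```python
import json

book_json = '''
{
  "store": {
    "book": [
      {"category": "reference", "author": "Nigel Rees",
       "title": "Sayings of the Century", "price": 8.95},
      {"category": "fiction", "author": "J. R. R. Tolkien",
       "title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99}
    ],
    "bicycle": {"color": "red", "price": 19.95}
  }
}
'''
data = json.loads(book_json)
book_list = data["store"]["book"]

color = data["store"]["bicycle"]["color"]                            # $.store.bicycle.color
first = book_list[0]                                                 # $.store.book[0]
titles = [b["title"] for b in book_list]                             # $.store.book[*].title
fiction = [b for b in book_list if b.get("category") == "fiction"]   # [?(@.category=='fiction')]
cheap = [b for b in book_list if b.get("price", float("inf")) < 10]  # [?(@.price<10)]
with_isbn = [b for b in book_list if "isbn" in b]                    # [?(@.isbn)]

print(color, titles)
print([b["title"] for b in cheap], [b["title"] for b in with_isbn])
```

One difference to keep in mind: jsonpath.jsonpath wraps even a single hit in a list (for example ['red']), while direct access here yields the bare value.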

Original source: https://blog.csdn.net/weixin_44244493/article/details/129766306
