Christian de Bellefeuille
asked on
RegEx to capture versions of each python module in Pipfile.lock
I'm trying to get a list of versions for each module within a Pipfile.lock file.
The file looks like the sample below: alabaster, argh, babel, and certifi are some of the module names, and their versions appear a few lines below each name (the distance varies, since some modules have more sha256 hashes than others).
I've tried several things, like:
(".*"): {(.|\n|\r)*],"version": "(.*)"
But it breaks at the end of each line, even if I add (.|\n|\r)*
Does anyone have an idea what I'm doing wrong and what I should write to get it working? I would like to capture the module name and version.
Thanks
"default": {
    "alabaster": {
        "hashes": [
            "sha256:446438bdcca0e05bd45ea2de1668c1d9b032e1a9154c2c259092d77031ddd359",
            "sha256:a661d72d58e6ea8a57f7a86e37d86716863ee5e92788398526d58b26a4e4dc02"
        ],
        "version": "==0.7.12"
    },
    "argh": {
        "hashes": [
            "sha256:a9b3aaa1904eeb78e32394cd46c6f37ac0fb4af6dc488daa58971bdc7d7fcaf3",
            "sha256:e9535b8c84dc9571a48999094fda7f33e63c3f1b74f3e5f3ac0105a58405bb65"
        ],
        "version": "==0.26.2"
    },
    "babel": {
        "hashes": [
            "sha256:1aac2ae2d0d8ea368fa90906567f5c08463d98ade155c0c4bfedd6a0f7160e38",
            "sha256:d670ea0b10f8b723672d3a6abeb87b565b244da220d76b4dba1b66269ec152d4"
        ],
        "version": "==2.8.0"
    },
    "certifi": {
        "hashes": [
            "sha256:017c25db2a153ce562900032d5bc68e9f191e44e9a0f762f373977de9df1fbb3",
            "sha256:25b64c7da4cd7479594d035c08c2d809eb4aab3a26e5a990ea98cc450c320f1f"
        ],
        "version": "==2019.11.28"
    },
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
But I don't understand this part:
(?:.|\n)+?
? is a quantifier for the character before it, but you don't have any character before it. The ( belongs to the group/OR.
?: appearing at the start of a group turns it into a non-capturing group. This group will match any character. Since the . (period) wildcard character doesn't match a new line character, I include the \n to match that.
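To make the explanation above concrete, here is a small sketch using Python's re module (an assumption on my part, since the thread does not say which regex engine the asker is using). The sample string and variable names are made up for illustration. It also shows what the trailing ? in +? does: placed after a quantifier, it makes that quantifier lazy, so it stops at the first possible match instead of the last.

```python
import re

# Hypothetical two-line sample, just for illustration.
text = "first line\nsecond line"

# By default "." does not match a newline, so ".+" stops at the end of line 1.
dot_only = re.match(r".+", text).group(0)          # "first line"

# Adding "\n" as an alternative lets the group cross line boundaries.
# "(?:" makes the group non-capturing, so groups() is empty.
m = re.match(r"(?:.|\n)+", text)
across_lines = m.group(0)                          # the whole text
captured = m.groups()                              # ()

# Without "?:" the same group captures (the last repetition it matched).
with_capture = re.match(r"(.|\n)+", text).groups()  # ("e",)

# re.DOTALL is an alternative: it makes "." itself match newlines.
dotall = re.match(r".+", text, re.DOTALL).group(0)

# "+" is greedy (runs to the last "line"); "+?" is lazy (stops at the first).
greedy = re.match(r"(?:.|\n)+line", text).group(0)
lazy = re.match(r"(?:.|\n)+?line", text).group(0)   # "first line"
```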
I might have also used pairs of complementary meta characters:
(?:\s|\S)+?
(?:\w|\W)+?
(?:\d|\D)+?
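As a quick check that these alternatives behave the same way, the sketch below runs each one against a short multi-line string in Python (again an assumption about the engine; the sample string is made up). The last entry, [\s\S]+?, is not from the comment above: it is the character-class spelling of the same idea, added here only for comparison.

```python
import re

# Hypothetical multi-line sample; each pattern must cross newlines to reach "b".
text = 'a": [\n1,\n2\n],\n"b"'

patterns = [
    r"(?:.|\n)+?",
    r"(?:\s|\S)+?",
    r"(?:\w|\W)+?",
    r"(?:\d|\D)+?",
    r"[\s\S]+?",      # character-class form, added for comparison
]

# Append the same literal tail to each pattern and collect what it matched.
results = [re.search(p + r'"b"', text).group(0) for p in patterns]

# Every variant matches the same span: the whole sample string.
all_same = len(set(results)) == 1
```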
If you weren't aware of the ?: group modifier, you probably don't know about look-ahead and negative look-ahead modifiers. In the following pattern, I use the ?! negative look-ahead modifier. This will reject (not match) the "default", "hashes", and "version" names.
"((?!default|hashes|version)[^"]+)":(?:.|\n)+?"version":\s+"([^"]+)"
Note: As I was testing this example, I realized that I didn't need the uncaptured "hashes" match in my original solution comment. So, I left that out of this pattern.
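For reference, this is roughly how the pattern above could be applied in Python with re.findall. It is a sketch run only against the first two modules of the excerpt in the question; SAMPLE, PATTERN, and module_versions are names chosen for this example. A complete Pipfile.lock contains other sections and keys that this excerpt does not show, so the pattern may need more excluded names there.

```python
import re

# First two modules from the excerpt in the question.
SAMPLE = '''"default": {
    "alabaster": {
        "hashes": [
            "sha256:446438bdcca0e05bd45ea2de1668c1d9b032e1a9154c2c259092d77031ddd359",
            "sha256:a661d72d58e6ea8a57f7a86e37d86716863ee5e92788398526d58b26a4e4dc02"
        ],
        "version": "==0.7.12"
    },
    "argh": {
        "hashes": [
            "sha256:a9b3aaa1904eeb78e32394cd46c6f37ac0fb4af6dc488daa58971bdc7d7fcaf3",
            "sha256:e9535b8c84dc9571a48999094fda7f33e63c3f1b74f3e5f3ac0105a58405bb65"
        ],
        "version": "==0.26.2"
    },
'''

# Group 1: a quoted name that is not default/hashes/version.
# Group 2: the value of the next "version" key after that name.
PATTERN = re.compile(
    r'"((?!default|hashes|version)[^"]+)":(?:.|\n)+?"version":\s+"([^"]+)"'
)


def module_versions(text):
    """Return a list of (module name, version) tuples found by PATTERN."""
    return PATTERN.findall(text)


pairs = module_versions(SAMPLE)
for name, version in pairs:
    print(name, version)
```

On this excerpt the loop prints "alabaster ==0.7.12" and "argh ==0.26.2".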
Christian
Are you still working on this problem?
If this truly is a JSON file, then why not use a library for parsing JSON rather than trying to shoehorn a regex into parsing this? I think you'd have a much easier time, not to mention a much better reading experience further down the road. JSON support looks to be available out of the box in Python:
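import json

# Pipfile.lock is JSON, so parse it instead of pattern-matching it.
with open('Pipfile.lock', 'r') as lockfile:
    obj = json.load(lockfile)

# Each key under "default" is a module name.
for moduleName in obj['default']:
    print(moduleName)
    # Versions are stored like "==0.7.12"; strip the leading "=" characters.
    print(obj['default'][moduleName]['version'].lstrip('='))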