Extracting a single field from a very long json file

A trivial task, but it may still save somebody some time, so I am glad to share it.
A friend of mine had a huge JSON file, and she needed to extract all the unique values of a field called "title". The file was too big to open in Notepad or Excel.

With these commands, I was able to obtain a clean, unique, and sorted list of all the values.


grep -o -E '"title":"[^"]+",' tmp.json | sort | uniq > output.txt

sed -i 's/"title":"//g' output.txt

sed -i 's/",//g' output.txt
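The three steps above can probably be chained into a single pipeline, so that the output file is written once and never edited in place. Here is a minimal sketch, assuming a grep that supports -o (GNU and BSD both do). The file names sample.json and titles.txt and the sample data are made up for illustration. Note one small difference from the commands above: the pattern does not require the trailing comma, so a "title" that is the last field of an object is matched too. Like the original, it will not cope with titles that contain escaped quotes or with whitespace around the colon.

```shell
# Hypothetical sample data standing in for the real file
printf '%s\n' '[{"title":"Beta","id":1},{"title":"Alpha","id":2},{"id":3,"title":"Beta"}]' > sample.json

# 1. grep -o prints every "title":"..." match on its own line,
#    even when the whole JSON document sits on a single line
# 2. sed strips the leading key and the closing quote
# 3. sort -u sorts the values and drops duplicates in one step
grep -o -E '"title":"[^"]+"' sample.json \
  | sed -e 's/^"title":"//' -e 's/"$//' \
  | sort -u > titles.txt

cat titles.txt
```

On the real file, replace sample.json with tmp.json and skip the printf line.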


Comments

Unknown said…
You can do the same with just one command:
sed -n "s/^.*\"title\":\"\([^\"]*\)\",.*/\1/p" tmp.json | sort | uniq > output.txt
