Extracting information out of Logseq

How to Extract Specific Information from Logseq Using Awk

Steps

Make a backup of the repo
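
One way to make the backup, sketched here, is to copy the whole graph directory to a timestamped sibling. The helper name backup_graph and the example path are illustrative and not part of the original workflow; if the graph is already under version control, a commit serves the same purpose.

```shell
#!/bin/sh
# backup_graph DIR: copy DIR to DIR.bak-YYYYmmdd-HHMMSS and print the new path.
# backup_graph is a hypothetical helper name used for illustration.
backup_graph() {
    src=${1%/}                                  # strip a trailing slash, if any
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    # -R copies recursively, -p preserves timestamps and permissions
    cp -Rp "$src" "$dest" && printf '%s\n' "$dest"
}

# Example call (the path is an assumption):
#   backup_graph ~/logseq
```

The function only defines the copy step; nothing runs until it is called with the graph directory.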

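Use gawk to extract the section you need. The -i inplace option makes gawk overwrite each input file with the script's output, so every line the script does not print is removed from that file. This is why the backup comes first.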

gawk -i inplace -f extract.awk journals/*

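This goes beyond basic find and replace: the script in extract.awk selects whole blocks by pattern. Replace {HEADER} with the name of the page you want to extract, escaping any regex metacharacters (a / in the page name must be written as \/):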

BEGIN {
    true = 1
    false = 0
    printLine = false
}

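{
    # A top-level block that starts with [[{HEADER}]]: opens the section to keep.
    if ($0 ~ /^- \[\[{HEADER}\]\]:/) {
        printLine = true
    # Any other top-level block (a line starting with "-") closes it.
    } else if ($0 ~ /^-[[:space:]]*/) {
        printLine = false
    }

    if (printLine) print $0
}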
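
Before running the script with -i inplace, it can be useful to preview what it would keep. The sketch below is a variant, not the script above: it takes the page name through a variable (header, a name chosen here) and compares it as a literal string with index(), so a / in the page name needs no escaping, and it should work with any POSIX awk. The sample journal content is invented.

```shell
#!/bin/sh
# Dry run: print what would be kept, without modifying any file.
# The sample journal and the file name are illustrative only.
tmp=$(mktemp -d)
cat > "$tmp/2024_01_01.md" <<'EOF'
- [[Work/Pepper Content]]:
  - Reviewed the onboarding doc
  - Call with the design team
- Went for a run
  - 5 km
EOF

extracted=$(awk -v header="Work/Pepper Content" '
    # A top-level block starting with [[header]]: opens the section.
    index($0, "- [[" header "]]:") == 1 { keep = 1; print; next }
    # Any other top-level block (first character "-") closes it.
    substr($0, 1, 1) == "-"             { keep = 0 }
    # Indented child lines are printed only while the section is open.
    keep                                { print }
' "$tmp/2024_01_01.md")

printf '%s\n' "$extracted"
rm -rf "$tmp"
```

Because nothing is written back, the same command can be pointed at real journal files to check the result before committing to the in-place run.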

Find and delete all zero-byte files. After the in-place run, any journal file that had no matching section is left empty.

find journals/* -size 0 -print -delete
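
Because -delete cannot be undone, it can help to list the matches first. The sketch below does that in a throwaway directory; the file names are made up, and -type f is an addition here that restricts the match to regular files.

```shell
#!/bin/sh
# Demo in a temporary directory; file names are illustrative.
demo=$(mktemp -d)
mkdir "$demo/journals"
: > "$demo/journals/2024_01_01.md"                       # left empty by the extraction
echo "- [[Work]]: notes" > "$demo/journals/2024_01_02.md"

# Step 1: preview only. Nothing is removed yet.
empty=$(find "$demo/journals" -type f -size 0 -print)
printf 'would delete: %s\n' "$empty"

# Step 2: the same expression with -delete appended.
find "$demo/journals" -type f -size 0 -delete
remaining=$(ls "$demo/journals")
printf 'remaining: %s\n' "$remaining"

rm -rf "$demo"
```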

Update the assets folder to include only the assets your new files need

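Find all assets you're using, and delete the ones you're not. The grep pipeline prints the asset file names your journals still reference; substitute those names for X, Y and Z in the rm command, which relies on zsh extended globbing to delete everything except the listed names.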

grep -oh '\.\./assets/[^)]*)' journals/* | awk '{print substr($0, 11, length($0)-11)}'
cd assets
setopt EXTENDED_GLOB
rm -- ^(X|Y|Z)
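
As an alternative to typing the keep list by hand, the comparison can be scripted. This is a sketch under assumptions, not the method above: it expects to run from the graph root with journals/ and assets/ side by side, the function name list_unused_assets is invented, and it matches names literally, so a name that appears percent-encoded in a link will not match its file on disk.

```shell
#!/bin/sh
# list_unused_assets: print every file in assets/ that no journal references.
# Hypothetical helper; run it from the graph root.
list_unused_assets() {
    used=$(mktemp)
    # Collect referenced names, e.g. "../assets/a.png)" becomes "a.png".
    grep -oh '\.\./assets/[^)]*)' journals/* 2>/dev/null |
        sed -e 's|^\.\./assets/||' -e 's|)$||' | sort -u > "$used"
    for f in assets/*; do
        [ -f "$f" ] || continue
        # -x whole-line match, -F fixed string, -q quiet
        grep -qxF -- "${f#assets/}" "$used" || printf '%s\n' "$f"
    done
    rm -f "$used"
}

# Review the list first:
#   list_unused_assets
# Then delete what it printed:
#   list_unused_assets | while IFS= read -r f; do rm -- "$f"; done
```

Keeping the listing and the deletion as separate steps leaves room to check the output before anything is removed.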

Use case

I wanted to extract my daily work interactions from my journal, which I record under the page Work/Pepper Content.

The awk script made this easy: it reduced each journal file to just the blocks recorded under that page reference.