[pacman-dev] [PATCH 0/1] Simple config parser added to scripts library
We're aware of many duplicates in our scripts, among which are things like eval $(awk '/DBPath/ {print $1$2$3}' @sysconfdir@/pacman.conf) which is obviously utterly hackish and potentially dangerous, yet (shockingly) quite a common practice in our scripts. Trying to fix all these here and there is a waste of time. As suggested by Dave, I took the liberty to add a config parser to the scripts library. Currently there're only two functions conf_key_val and conf_key_set, which are modified from pacman-key's get_from(). The usage is a bit different, as described in the commit message. If we agree on this change, feel free to port our scripts to use this. lolilolicon (1): scripts/library: add config_parser.sh scripts/library/README | 3 ++ scripts/library/config_parser.sh | 56 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh -- 1.7.6.4
config_parser.sh provides functions to support parsing simple config files, pacman.conf in particular. Our scripts should never use a simple eval to get config values such as DBPath. conf_key_val() is modified from pacman-key's get_from() function. Please note conf_key_val() and conf_key_set() both read config file from stdin, instead of $1. Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 ++ scripts/library/config_parser.sh | 56 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh diff --git a/scripts/library/README b/scripts/library/README index 1e9c962..4fb70dd 100644 --- a/scripts/library/README +++ b/scripts/library/README @@ -1,6 +1,9 @@ This folder contains code snippets that can be reused by multiple scripts. A brief description of each file follows. +config_parser.sh: +Parses simple config files such as pacman.conf. + output_format.sh: Provides basic output formatting functions with levels 'msg', 'msg2', 'warning' and 'error'. The 'msg' amd 'msg2' functions print to stdout diff --git a/scripts/library/config_parser.sh b/scripts/library/config_parser.sh new file mode 100644 index 0000000..1d125eb --- /dev/null +++ b/scripts/library/config_parser.sh @@ -0,0 +1,56 @@ +shopt -s extglob + +# Parse value of a non-repeating key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default value if not defined in config file [optional] +# return: +# 0 - print value of key +# 1 - print nothing +conf_key_val() { + local key val + while IFS='=' read -r key val; do + # strip whitespaces + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + val=${val##*([[:space:]])} + val=${val%%*([[:space:]])} + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key || -z $val ]] && continue + # return success as soon as a value is found + if [[ $key == "$1" ]]; then + printf '%s' "$val" + return 0 + fi + done + # fallback to default value if specified + if [[ -n $2 ]]; then + printf '%s' "$2" + return 0 + fi + return 1 +} + +# Check whether an on/off key is set (on) in a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# return: +# 0 - key is set +# 1 - key is not set +conf_key_set() { + local key + while read -r key; do + # strip whitespaces, ignore comments + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key ]] && continue + # return success as soon as the key is found + if [[ $key == "$1" ]]; then + return 0 + fi + done + return 1 +} -- 1.7.6.4
config_parser.sh provides functions to support parsing simple config files, pacman.conf in particular. Our scripts should never use a simple eval to get config values such as DBPath. conf_key_val() is modified from pacman-key's get_from() function. Please note conf_key_val() and conf_key_set() both read config file from stdin, instead of $1. Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 ++ scripts/library/config_parser.sh | 56 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh diff --git a/scripts/library/README b/scripts/library/README index 1e9c962..4fb70dd 100644 --- a/scripts/library/README +++ b/scripts/library/README @@ -1,6 +1,9 @@ This folder contains code snippets that can be reused by multiple scripts. A brief description of each file follows. +config_parser.sh: +Parses simple config files such as pacman.conf. + output_format.sh: Provides basic output formatting functions with levels 'msg', 'msg2', 'warning' and 'error'. The 'msg' amd 'msg2' functions print to stdout diff --git a/scripts/library/config_parser.sh b/scripts/library/config_parser.sh new file mode 100644 index 0000000..1d125eb --- /dev/null +++ b/scripts/library/config_parser.sh @@ -0,0 +1,56 @@ +shopt -s extglob + +# Parse value of a non-repeating key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default value if not defined in config file [optional] +# return: +# 0 - print value of key +# 1 - print nothing +conf_key_val() { + local key val + while IFS='=' read -r key val; do + # strip whitespaces + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + val=${val##*([[:space:]])} + val=${val%%*([[:space:]])} + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key || -z $val ]] && continue + # return success as soon as a value is found + if [[ $key == "$1" ]]; then + printf '%s' "$val" + return 0 + fi + done + # fallback to default value if specified + if [[ -n $2 ]]; then + printf '%s' "$2" + return 0 + fi + return 1 +} + +# Check whether an on/off key is set (on) in a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# return: +# 0 - key is set +# 1 - key is not set +conf_key_set() { + local key + while read -r key; do + # strip whitespaces, ignore comments + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key ]] && continue + # return success as soon as the key is found + if [[ $key == "$1" ]]; then + return 0 + fi + done + return 1 +} -- 1.7.6.4
On Fri, Sep 30, 2011 at 07:37:48PM +0800, lolilolicon wrote:
config_parser.sh provides functions to support parsing simple config files, pacman.conf in particular. Our scripts should never use a simple eval to get config values such as DBPath.
conf_key_val() is modified from pacman-key's get_from() function. Please note conf_key_val() and conf_key_set() both read config file from stdin, instead of $1.
I don't see the advantage of doing it this way. If/when we add a function like conf_keyval_set() (which actually updates a file) we'll have to ignore this convention and pass the filename, for obvious reasons. At the risk of painting my shed green, I dislike your naming convention, particularly conf_key_set, which doesn't set anything at all. I would have thought that something such as the following would be appropriate: # accessors conf_key_isset() # determines existance of a key w/o an associated value. conf_keyval_get() # gets the value of a key. # mutators conf_key_set() # sets a key without a value, as long as it doesn't exist. conf_keyval_set() # sets a key with a value, as long as it doesn't exist. And of course, accessors return 0/1 to indicate existance, mutators return 0/1 to indicate write success/failure. d
Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 ++ scripts/library/config_parser.sh | 56 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh
diff --git a/scripts/library/README b/scripts/library/README index 1e9c962..4fb70dd 100644 --- a/scripts/library/README +++ b/scripts/library/README @@ -1,6 +1,9 @@ This folder contains code snippets that can be reused by multiple scripts. A brief description of each file follows.
+config_parser.sh: +Parses simple config files such as pacman.conf. + output_format.sh: Provides basic output formatting functions with levels 'msg', 'msg2', 'warning' and 'error'. The 'msg' amd 'msg2' functions print to stdout diff --git a/scripts/library/config_parser.sh b/scripts/library/config_parser.sh new file mode 100644 index 0000000..1d125eb --- /dev/null +++ b/scripts/library/config_parser.sh @@ -0,0 +1,56 @@ +shopt -s extglob + +# Parse value of a non-repeating key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default value if not defined in config file [optional] +# return: +# 0 - print value of key +# 1 - print nothing +conf_key_val() { + local key val + while IFS='=' read -r key val; do + # strip whitespaces + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + val=${val##*([[:space:]])} + val=${val%%*([[:space:]])}
read is the opportune thing to use here -- it's faster (which surprises me) and more concise. read -r key <<< "$key" read -r val <<< "$val"
+ # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key || -z $val ]] && continue + # return success as soon as a value is found + if [[ $key == "$1" ]]; then + printf '%s' "$val" + return 0 + fi + done + # fallback to default value if specified + if [[ -n $2 ]]; then + printf '%s' "$2" + return 0 + fi + return 1 +} + +# Check whether an on/off key is set (on) in a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# return: +# 0 - key is set +# 1 - key is not set +conf_key_set() { + local key + while read -r key; do + # strip whitespaces, ignore comments + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key ]] && continue + # return success as soon as the key is found + if [[ $key == "$1" ]]; then + return 0 + fi + done + return 1 +} -- 1.7.6.4
On Fri, Sep 30, 2011 at 8:12 PM, Dave Reisner <d@falconindy.com> wrote:
On Fri, Sep 30, 2011 at 07:37:48PM +0800, lolilolicon wrote:
config_parser.sh provides functions to support parsing simple config files, pacman.conf in particular. Our scripts should never use a simple eval to get config values such as DBPath.
conf_key_val() is modified from pacman-key's get_from() function. Please note conf_key_val() and conf_key_set() both read config file from stdin, instead of $1.
I don't see the advantage of doing it this way. If/when we add a function like conf_keyval_set() (which actually updates a file) we'll have to ignore this convention and pass the filename, for obvious reasons.
Well I called it config_parser. I meant for it to *read* files only. For readers reading from stdin is more natural. I see both functions as a variant of `read'... We can pipe around more easily, without the need of process substitution. And we don't have a convention yet. After all, getters and setters are two differnt things, I don't think we have to unify the syntax.
At the risk of painting my shed green, I dislike your naming convention, particularly conf_key_set, which doesn't set anything at all. I would have thought that something such as the following would be appropriate:
# accessors conf_key_isset() # determines existance of a key w/o an associated value. conf_keyval_get() # gets the value of a key.
# mutators conf_key_set() # sets a key without a value, as long as it doesn't exist. conf_keyval_set() # sets a key with a value, as long as it doesn't exist.
And of course, accessors return 0/1 to indicate existance, mutators return 0/1 to indicate write success/failure.
d
Again I didn't intend to implement mutators. But if I did, I would name them: # accessors conf_key_val # print the value of key conf_key_set # Is key set? # mutators conf_set_key # set a key to either "on" or =value [1] conf_unset_key # simply comment the line(s) setting key [1] conf_set_key would add the =value part only if value is specified: conf_set_key conf key # set key on conf_set_key conf key value # set key to value I think it would be pretty clear _by comparison_ that: - The 'set' in conf_key_set is a past participle. => accessor - The 'set' in conf_set_key is a verb. => mutator It would be hard to mix those two up if we implemented them both. At last, I just wanted to make myself clear. I'm open to name changes :)
Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 ++ scripts/library/config_parser.sh | 56 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh
diff --git a/scripts/library/README b/scripts/library/README index 1e9c962..4fb70dd 100644 --- a/scripts/library/README +++ b/scripts/library/README @@ -1,6 +1,9 @@ This folder contains code snippets that can be reused by multiple scripts. A brief description of each file follows.
+config_parser.sh: +Parses simple config files such as pacman.conf. + output_format.sh: Provides basic output formatting functions with levels 'msg', 'msg2', 'warning' and 'error'. The 'msg' amd 'msg2' functions print to stdout diff --git a/scripts/library/config_parser.sh b/scripts/library/config_parser.sh new file mode 100644 index 0000000..1d125eb --- /dev/null +++ b/scripts/library/config_parser.sh @@ -0,0 +1,56 @@ +shopt -s extglob + +# Parse value of a non-repeating key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default value if not defined in config file [optional] +# return: +# 0 - print value of key +# 1 - print nothing +conf_key_val() { + local key val + while IFS='=' read -r key val; do + # strip whitespaces + key=${key##*([[:space:]])} + key=${key%%*([[:space:]])} + val=${val##*([[:space:]])} + val=${val%%*([[:space:]])}
read is the opportune thing to use here -- it's faster (which surprises me) and more concise.
read -r key <<< "$key" read -r val <<< "$val"
Nice tip! I like it, and it also does away the need of extglob...
On Fri, Sep 30, 2011 at 7:12 AM, Dave Reisner <d@falconindy.com> wrote:
On Fri, Sep 30, 2011 at 07:37:48PM +0800, lolilolicon wrote:
config_parser.sh provides functions to support parsing simple config files, pacman.conf in particular. Our scripts should never use a simple eval to get config values such as DBPath.
conf_key_val() is modified from pacman-key's get_from() function. Please note conf_key_val() and conf_key_set() both read config file from stdin, instead of $1.
I don't see the advantage of doing it this way. If/when we add a function like conf_keyval_set() (which actually updates a file) we'll have to ignore this convention and pass the filename, for obvious reasons. Agreed- reading from stdin is a bit overkill for these purposes.
At the risk of painting my shed green, I dislike your naming convention, particularly conf_key_set, which doesn't set anything at all. I would have thought that something such as the following would be appropriate:
# accessors conf_key_isset() # determines existance of a key w/o an associated value. conf_keyval_get() # gets the value of a key. These are pacman.conf only, right? Why not pacman_conf_get() pacman_conf_isset()
Another set of issues that may/may not be covered, but I didn't see anything at first glance: 1) is this [options] only, or are sections not even handled? Big problem if this is to be made universal, or one is looking up SigLevel... I think one needs to call this with both a file path and a section name. We probably also need a function that returns all the section names (pacman_conf_get_sections?). 2) How are multivalued options handled? Remembering the following are equivalent: HoldPkg = foo1 foo2 HoldPkg = foo3 and HoldPkg = foo1 foo2 foo3
# mutators conf_key_set() # sets a key without a value, as long as it doesn't exist. conf_keyval_set() # sets a key with a value, as long as it doesn't exist.
And of course, accessors return 0/1 to indicate existance, mutators return 0/1 to indicate write success/failure. If we start mutating the config file, I'll be a bit scared- I don't see a whole lot of need for these anytime soon.
-Dan
On Fri, Sep 30, 2011 at 11:03 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Fri, Sep 30, 2011 at 7:12 AM, Dave Reisner <d@falconindy.com> wrote:
At the risk of painting my shed green, I dislike your naming convention, particularly conf_key_set, which doesn't set anything at all. I would have thought that something such as the following would be appropriate:
# accessors conf_key_isset() # determines existance of a key w/o an associated value. conf_keyval_get() # gets the value of a key. These are pacman.conf only, right? Why not pacman_conf_get() pacman_conf_isset()
No they're pretty simple and general. But we should create pacman_conf_* functions that call these more general functions.
Another set of issues that may/may not be covered, but I didn't see anything at first glance: 1) is this [options] only, or are sections not even handled? Big problem if this is to be made universal, or one is looking up SigLevel... I think one needs to call this with both a file path and a section name. We probably also need a function that returns all the section names (pacman_conf_get_sections?).
I considered this- indeed, this is a reason I opted for reading from stdin, think something like this: pacman_conf_get_section $header | conf_key_val $key
2) How are multivalued options handled? Remembering the following are equivalent: HoldPkg = foo1 foo2 HoldPkg = foo3 and HoldPkg = foo1 foo2 foo3
These are not handled yet. We can write something like: conf_key_get_vals $key [$delim [$def]] ...just a variant of conf_key_val()
# mutators conf_key_set() # sets a key without a value, as long as it doesn't exist. conf_keyval_set() # sets a key with a value, as long as it doesn't exist.
And of course, accessors return 0/1 to indicate existance, mutators return 0/1 to indicate write success/failure. If we start mutating the config file, I'll be a bit scared- I don't see a whole lot of need for these anytime soon.
-Dan
config_parser.sh provides functions to support parsing simple config files, pacman.conf for example. Our scripts should never use a dirty eval to get config values such as DBPath. conf_key_get_val() is modified from pacman-key's get_from() function. All the parser functions implemented read config file from stdin, which makes chaining a breeze. Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 + scripts/library/config_parser.sh | 106 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 109 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh diff --git a/scripts/library/README b/scripts/library/README index 1e9c962..4fb70dd 100644 --- a/scripts/library/README +++ b/scripts/library/README @@ -1,6 +1,9 @@ This folder contains code snippets that can be reused by multiple scripts. A brief description of each file follows. +config_parser.sh: +Parses simple config files such as pacman.conf. + output_format.sh: Provides basic output formatting functions with levels 'msg', 'msg2', 'warning' and 'error'. The 'msg' amd 'msg2' functions print to stdout diff --git a/scripts/library/config_parser.sh b/scripts/library/config_parser.sh new file mode 100644 index 0000000..73e40d3 --- /dev/null +++ b/scripts/library/config_parser.sh @@ -0,0 +1,106 @@ +# Parse value of a single-value key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default value if not defined in config file [optional] +# return: +# 0 - print value of key +# 1 - print nothing +conf_key_get_val() { + local key val + while IFS='=' read -r key val; do + # strip whitespaces + IFS=$' \t' read -r key <<< "$key" + IFS=$' \t' read -r val <<< "$val" + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key || -z $val ]] && continue + # return success as soon as a value is found + if [[ $key == "$1" ]]; then + printf '%s' "$val" + return 0 + fi + done + # fallback to default value if specified + if [[ -n $2 ]]; then + printf '%s' "$2" + return 0 + fi + return 1 +} + +# Parse values of a repeatable, multi-value key from a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# $2: default values if not defined in config file [optional] +# return: +# 0 - print values of key, one per line +# 1 - print nothing +conf_key_get_vals() { + local key val vals=() + while IFS='=' read -r key val; do + # strip whitespaces + IFS=$' \t' read -r key <<< "$key" + IFS=$' \t' read -r val <<< "$val" + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key || -z $val ]] && continue + # keep collecting values + if [[ $key == "$1" ]]; then + vals+=($val) + fi + done + # fallback to default values if specified + (( ${#vals[@]} )) || vals=($2) + (( ${#vals[@]} )) || return 1 + printf '%s\n' "${vals[@]}" + return 0 +} + +# Check whether an on/off key is set in a config file. +# +# stdin: config file +# $1: key - shall not contain '=', or start/end with whitespaces +# return: +# 0 - key is set +# 1 - key is not set +conf_key_isset() { + local key + while read -r key; do + # strip whitespaces + IFS=$' \t' read -r key <<< "$key" + # ignore comments and invalid lines + [[ ${key:0:1} == '#' || -z $key ]] && continue + # return success as soon as the key is found + if [[ $key == "$1" ]]; then + return 0 + fi + done + return 1 +} + +# Print the first section with the specified header from a config file. +# +# stdin: config file +# $1: header string - exact header string of the section +# $2: header pattern - regex matching only header strings +# return: +# 0 - print lines in section +# 1 - print nothing +conf_get_section() { + local line + while read -r line; do + # strip whitespaces + IFS=$' \t' read -r line <<< "$line" + # ignore comments and invalid lines + [[ ${line:0:1} == '#' || -z $line ]] && continue + # print section till EOF or next section + if [[ $line == "$1" ]]; then + while printf '%s\n' "$line"; do + read -r line || break + [[ $line =~ $2 ]] && break + done + return 0 + fi + done + return 1 +} -- 1.7.6.4
On Sat, Oct 1, 2011 at 3:01 AM, lolilolicon <lolilolicon@gmail.com> wrote:
config_parser.sh provides functions to support parsing simple config files, pacman.conf for example. Our scripts should never use a dirty eval to get config values such as DBPath.
conf_key_get_val() is modified from pacman-key's get_from() function. All the parser functions implemented read config file from stdin, which makes chaining a breeze.
Signed-off-by: lolilolicon <lolilolicon@gmail.com> --- scripts/library/README | 3 + scripts/library/config_parser.sh | 106 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 109 insertions(+), 0 deletions(-) create mode 100644 scripts/library/config_parser.sh
So I insist on reading from stdin. Don't mess with me :) I've added functions to address Dan's concerns. They are modelled towards pacman.conf but they should be general enough to be general. Upon these simple parser functions, we should build config file specific ones, named for example pacman_conf_*. Please don't be too picky on details. At this stage, the right thing to do is make sure this is going in the right direction...
participants (3)
-
Dan McGee
-
Dave Reisner
-
lolilolicon