Problem with a script - please help
Hi,
I have a problem with a script which I am writing for my final thesis. The thesis must be ready within two weeks, so I would really appreciate it if you could help me. My thesis is about h-node.
The script processes two different logs of user actions to figure out what kinds of hardware contributions I have made to h-node. Specifically, I am trying to count how many contributions I have made per hardware class.
I have two source files for the script. Both contain lines of data which I am trying to process. Here are some examples of the content:
Source file 1:
Format: the model DEVICE_NAME has been [updated|inserted] by USER_NAME at DATE_AND_TIME
the model E220 HSDPA Modem / E270 HSDPA/HSUPA Modem has been updated by lammi87 at 22:48, 18 May 2012
Source file 2:
Format: HARDWARE_CLASS DEVICE_NAME
notebook ProBook 6555b
Source file 1 contains most of what I need. The only thing it lacks is the HARDWARE_CLASS associated with each line, so I have to get that from source file 2.
Source file 1 has one line per contribution I have made, but source file 2 does not. Even when I make multiple contributions to one device, source file 2 contains only one line with that DEVICE_NAME.
So, for each HARDWARE_CLASS, I need to count how many times each DEVICE_NAME found in source file 2 appears in source file 1.
The problem is this:
I can't get the array in my script to work correctly. I am trying to save all the DEVICE_NAMES to the array, one DEVICE_NAME per slot, by running a for loop. The names can contain white space and special characters like ', (, ), [ and ], and the lines can have extra tabs and spaces at the end.
Then I run another for loop inside the first one to count how many times a line containing a particular DEVICE_NAME is present in source file 1. No matter what I try, I can't get the array to work.
Sorry for the long explanation. Example source files and the script can be found as attachments below.
Attachment | Size |
---|---|
script.sh | 819 bytes |
Source_file_1.txt | 558 bytes |
Source_file_2.txt | 287 bytes |
Here's the script again.
Attachment | Size |
---|---|
script.txt | 819 bytes |
http://mtjm.eu/patches/script.diff shows one possibility of fixing it.
DEVICE_NAMES is not an array; the extra trailing whitespace makes the
match fail.
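To illustrate the point about the array and the trailing whitespace, here is a minimal bash sketch. It is not the code from the linked patch; the function name load_device_names is hypothetical, and it assumes source file 2 uses the "HARDWARE_CLASS DEVICE_NAME" layout described above, with the class as a single word.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): fill the global
# array DEVICE_NAMES from a file whose lines look like
# "HARDWARE_CLASS DEVICE_NAME".
load_device_names() {
    local file=$1 class name
    DEVICE_NAMES=()
    # read puts the first word into class and the rest of the line into
    # name; with the default IFS it also drops trailing spaces and tabs.
    # The "|| [ -n ... ]" part handles a last line without a newline.
    while read -r class name || [ -n "$class" ]; do
        [ -n "$name" ] || continue      # skip blank or class-only lines
        DEVICE_NAMES+=("$name")         # one device name per array slot
    done < "$file"
}
```

After calling load_device_names on source file 2, a loop such as for name in "${DEVICE_NAMES[@]}" visits each name as one word, even when it contains spaces, quotes or brackets. The double quotes around the expansion are what keep each name intact.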
cat ... | grep ... | sed ... can be replaced by one invocation of sed;
-r makes it use the more common extended regexp syntax.
Device names contain special characters, so they need shell quoting and
matching via fgrep. Since fgrep searches for lines matching any pattern
line, the inner loop can be simplified.
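The two simplifications above can be sketched roughly as follows. This is an illustration under assumptions, not the content of script.diff: the function names device_names and count_matches are made up, -E is used as the equivalent of GNU sed's -r, and grep -F stands in for fgrep.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, assuming "HARDWARE_CLASS DEVICE_NAME" lines.

# One sed call: drop the leading class word, strip trailing whitespace,
# and delete lines that end up empty.
device_names() {
    sed -E -e 's/^[^[:space:]]+([[:space:]]+|$)//' \
           -e 's/[[:space:]]+$//' \
           -e '/^$/d' "$1"
}

# Count the lines of the log ($1) that contain any device name from the
# device list ($2) as a fixed string, so ( ) [ ] ' are not special.
count_matches() {
    local log=$1 names
    names=$(device_names "$2")
    # An empty pattern would match every line, so handle that case first.
    if [ -z "$names" ]; then
        echo 0
        return
    fi
    # A newline-separated pattern list matches lines containing any entry.
    # grep -c exits with status 1 when the count is 0; that is not an error.
    grep -F -c -e "$names" "$log" || [ "$?" -eq 1 ]
}
```

One caveat of fixed-string matching is that it matches substrings: a short device name also matches log lines for a longer name that contains it, which can inflate the count.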
bash -ex is very useful when debugging such scripts.
Thank you very much for your answer. It works better than anything I have done. However, I noticed a problem. After running the updated script with my real source files 1 and 2, I get a total of 730 contributions. This does not match the result I get from a different part of the script.
I calculate the total number of my contributions by counting the lines in source file 1 that contain my user name. When I do that, I get 912. Do you have any idea why these don't match?
Again, thanks for your help, but could you spare a little more of your time, please?
Attachment | Size |
---|---|
script.txt | 828 bytes |
source-hardware-1.txt | 147.43 KB |
source-hardware-2.txt | 22.83 KB |
There is no check that the contributions are yours: there are 730
contributions of yours and 912 (or 913?) contributions by all users for
devices listed in the other file.
I've implemented http://mtjm.eu/patches/ccontr.py to check that this
isn't a shell escape or grep issue (it outputs to standard output
instead of using the third argument).
I can check whether each contribution was made by me by filtering on my user name.
I'll just change this:
let "NUM=$(fgrep -c "$DEVICE_NAMES" $1 )"
into this
let "NUM=$(grep -F "$DEVICE_NAMES" $1 | grep -c "lammi87" )"
That gave me 728, which I'd say is close enough. I'll just have to point out that the figures I get from my scripts are not 100% accurate, but accurate enough.
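For the per-class totals, a stricter variant could look like the sketch below. It is only an illustration: the function name count_by_class is hypothetical, and it assumes every log line follows the "the model DEVICE_NAME has been [updated|inserted] by USER_NAME at DATE_AND_TIME" format exactly. Matching the fixed text around the device name and the user name, instead of the bare strings, avoids counting a device whose name merely contains another device's name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "HARDWARE_CLASS COUNT" for one user.
#   $1 = log file (source file 1), $2 = device list (source file 2),
#   $3 = user name
count_by_class() {
    local log=$1 devices=$2 user=$3 class name n
    while read -r class name || [ -n "$class" ]; do
        [ -n "$name" ] || continue      # skip blank or class-only lines
        # Fixed-string match on the text surrounding the device name,
        # then keep only the lines written by the given user.
        n=$(grep -F -e "the model $name has been " "$log" |
            grep -F -c -e " by $user at ") || true
        printf '%s %s\n' "$class" "$n"
    done < "$devices" |
    # Sum the per-device counts for each hardware class.
    awk '{ total[$1] += $2 } END { for (c in total) print c, total[c] }' |
    sort
}
```

It would be called as count_by_class source-hardware-1.txt source-hardware-2.txt lammi87. Summing the second column gives a total that can be compared with the figures above, although it need not equal them, because the matching rule is different.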
Thanks for your help! You helped a lot!
>My thesis is about h-node.
That sounds interesting, would be nice to read it once you're done.
That's nice to hear. I'll be sure to post it here when it is done and has been uploaded to my school's thesis system. It should take around three weeks.
My thesis aims to be an introduction to h-node for new users and to spread knowledge about it and about free software. I also aim to extend h-node's hardware database. So nothing groundbreaking in terms of new features for h-node, I'm afraid.