Efficient File Copying with Rsync: Including Only Certain File Types

When it comes to efficient and reliable file synchronization and copying, Rsync stands out as a powerful tool. Not only does it allow you to transfer files between local and remote locations, but it also offers the flexibility to include or exclude specific files based on their types.

Here is the Rsync command for including certain file types while excluding all others:

rsync -zarv --include="*/" --include="*.sh" --exclude="*" "$from" "$to"

Explanation of the Rsync Options:

  • -z: Compress the data during the transfer, reducing the network bandwidth usage.
  • -a: Archive mode that preserves various file attributes such as permissions, timestamps, etc.
  • -r: Recursively copy directories (already implied by -a, so it is redundant here).
  • -v: Verbose mode to display detailed information about the copying process.
  • --include="*/": This option includes all directories, so that Rsync descends into them. Without it, the final --exclude="*" would also match directories and Rsync would never look inside them.
  • --include="*.sh": This option includes all files with the ".sh" extension (shell script files). You can replace "*.sh" with any other file pattern you desire.
  • --exclude="*": This option excludes everything that was not matched by an earlier "--include" option. The rules are checked in order and the first match wins, so this option must come last.
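The order-dependent behavior of these filter rules can be sketched in Python. This is a simplified model written for illustration, not Rsync's actual matching code: the name is_transferred is hypothetical, the function only looks at the last path component, and it relies on fnmatch, whose wildcard rules are close to but not identical to Rsync's.

```python
from fnmatch import fnmatchcase

# Ordered filter rules mirroring the command above: (action, pattern).
RULES = [("include", "*/"), ("include", "*.sh"), ("exclude", "*")]


def is_transferred(name, is_dir, rules=RULES):
    """Return True if the first rule matching `name` is an include rule.

    Simplified model: `name` is the last path component, and a pattern
    ending in "/" can only match a directory.
    """
    for action, pattern in rules:
        if pattern.endswith("/"):
            if not is_dir:
                continue
            pattern = pattern[:-1]
        if fnmatchcase(name, pattern):
            return action == "include"
    # Anything that matches no rule at all is transferred.
    return True


for name, is_dir in [("run.sh", False), ("notes.txt", False), ("src", True)]:
    print(name, is_transferred(name, is_dir))
```

Dropping the ("include", "*/") rule from the list makes the "src" directory fall through to the final exclude, which is why the directory include has to come first.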

Using the “-m” Option to Ignore Empty Directories:

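The “-m” option (--prune-empty-dirs) tells Rsync to prune empty directories from the list of files to transfer. With the command above, the directory include would otherwise recreate the whole source tree at the destination, including directories that contain no matching file. With “-m”, directories that would end up empty are simply not created. Note that it does not delete empty directories that already exist at the destination. This is especially useful when you want to keep your destination directory clean, without the empty folders that the filtering would otherwise leave behind.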

Posted in Linux

How to select genomic region using samtools

To select a genomic region using samtools, you can use the faidx command. This command is used to index a FASTA file and extract subsequences from it.

Here is an example of how to use faidx to select a genomic region:

# index the FASTA file
samtools faidx genome.fa
 
# extract a specific region from the genome
samtools faidx genome.fa chr1:100-200

This will extract the subsequence from the genome located on chromosome 1, between base pairs 100 and 200. The output will be printed to the terminal, and you can redirect it to a file if you want to save it.
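Region strings such as chr1:100-200 use 1-based coordinates with both ends included, so this region spans 101 bases. The small Python sketch below illustrates the convention on an in-memory sequence. The helper names parse_region and extract and the toy genome are made up for the illustration, and this is not how samtools reads an indexed FASTA file.

```python
def parse_region(region):
    """Split a region string such as 'chr1:100-200' into (name, start, end)."""
    chrom, _, span = region.rpartition(":")
    start, end = (int(value) for value in span.split("-"))
    return chrom, start, end


def extract(sequences, region):
    """Return the bases of `region`, using 1-based inclusive coordinates."""
    chrom, start, end = parse_region(region)
    # Python slices are 0-based and half-open, hence start - 1.
    return sequences[chrom][start - 1:end]


# Toy genome holding a single 10-base sequence (illustrative data only).
genome = {"chr1": "ACGTACGTAC"}
print(extract(genome, "chr1:2-5"))  # prints CGTA
```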

Alternatively, you can use the samtools view command to extract a genomic region from a BAM file. This command filters the alignments in a BAM file based on their position in the genome. The BAM file must be coordinate-sorted and indexed (samtools index alignments.bam) for region queries to work. Here is an example of how to use samtools view to extract a genomic region:

# extract alignments that overlap the region of interest
samtools view -b -h alignments.bam chr1:100-200 > region.bam
 
# convert the extracted alignments to FASTA format
# (samtools bam2fq would write FASTQ instead)
samtools fasta region.bam > region.fasta

This will extract all of the alignments that overlap the specified genomic region and convert them to FASTA format. You can then use tools like fasta_tools or seqtk to manipulate the FASTA file as needed.

Posted in bioinformatics

Nice vscode extensions

Whether you are a new developer or an expert, you should really consider using vscode (https://code.visualstudio.com). I have to admit I have been using VIM for years and I believe I will always use it. I tried some great tools such as PyCharm (for Python) or Sublime. I also tried other all-in-one tools such as Aptana or Eclipse. I always dropped them after a while, simply because I felt more comfortable in my VIM environment. I am not saying that those are not great tools (they have also improved a lot compared to a few years back). However, the more I try vscode, the more impressed I am. Of course, nothing is perfect: I had to reinstall it today on my Fedora box. However, this was done so quickly that I was ready to work within a few minutes. Here are some notes about the extensions I have installed.

Installation

First, as for the installation, I just followed the instructions on their web page (which set up the Microsoft package repository) and


sudo dnf install code

did the job. Just type code in a shell and we are ready to go. Since I reinstalled vscode, I decided to remove the .vscode/ directory in my home to start from scratch.

Tabnine

Once vscode started, I installed TabNine. In the side bar, click on Extensions, search for tabnine and install it.

Open a file (e.g. a Python file) and start typing. You should see the completion suggestions. Pressing CTRL+space should provide a list of names for completion.

Neat !!

Python

I then installed the Pylance Python extension. When you start a new project, remember to select the correct environment.

gitgraph

I installed the gitgraph extension. You can easily commit files, see the differences between versions, see the deleted and changed files and so on. It looks quite promising. Of course, the most interesting part to me was the graph side. To see it, click on “git graph” in the bottom panel.

github related

With respect to github extensions, I installed the one from Peter Knister, which looks quite popular and general enough. It allows you to browse pull requests, issues and so on.

Another extension that looks great is GitHub Repositories. It creates a green icon in the bottom right corner, which allows you to browse a github repository without cloning it. You can quickly switch between your repositories that way. Actually, it looks like you can also do it with other people's repositories.

Others

A simple yet potentially useful extension is Excel Viewer. It gives a neat visualisation of your complex CSV files.

Posted in Python, Software development

conda list command broken (unknown variable: python_implementation)

Recently, my conda environment started raising a long error report when calling


conda list

One of the relevant error messages refers to an unknown variable python_implementation.

raise SyntaxError('unknown variable: %s' % expr)
File "", line None
SyntaxError: unknown variable: python_implementation

My colleagues had no problems. From a fresh conda environment, no problem either. The cause was that some tool had overwritten or deleted a package called xopen.

This fixed the problem for me:

conda install xopen

Posted in bioinformatics

ipython autocompletion does not work

With IPython version 7.12.0 and Python 3.7.3, IPython crashed whenever I tried to autocomplete a file name or a module import.

This fixed my problem:


pip install pyreadline

The jedi package was also at version 0.18.0, which seems to cause problems, so downgrading it solved the issue in the end:

pip install jedi==0.17.2
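Before pinning anything, it can help to check which versions are actually installed in the active environment. The sketch below is one possible way to do it, assuming Python 3.8 or later (importlib.metadata is not in the standard library of Python 3.7); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages involved in the completion problem described above.
for name in ("ipython", "jedi", "pyreadline"):
    print(name, installed_version(name) or "not installed")
```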

Posted in Computer Science

Setup ssh server on Fedora Linux platform

It can be useful to set up an SSH server on your Linux box.
For instance, once it is set up, you can copy files from a Windows machine to your Linux box using scp or sftp, which both run over SSH.

It is actually quite easy. Here, we will consider a Fedora box.

The first step is to make sure that openssh-server is installed on your Fedora system. The following command installs it if it is missing:
sudo dnf install openssh-server

The next step is to enable the sshd systemd service so that the SSH daemon starts at boot:

sudo systemctl enable sshd

Finally, start the SSH service. With the legacy service command (on Fedora the service is named sshd):

sudo service sshd start

With systemctl, on current versions of Fedora:

sudo systemctl start sshd
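Once the service is running, you may want to confirm that something is listening on the SSH port. Here is a minimal Python sketch, assuming the default port 22 and a hypothetical helper called port_is_open. It only checks that a TCP connection can be opened; it does not verify that the listener is really an SSH daemon.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# 22 is the default SSH port; adapt it if sshd listens elsewhere.
print("sshd reachable on localhost:", port_is_open("localhost", 22))
```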

Posted in Linux

How to use subprocess with PIPEs

In Python, the subprocess module allows you to execute Linux/Unix commands directly from Python. The official Python documentation is very useful should you need to go further, yet it may be a bit scary for newcomers. The syntax may also change from one Python version to another.

Here is a way to use subprocess to run piped commands with Python 3.7. For example, imagine you want to run this command, which contains a pipe and a redirection:

    zcat file1.dat.gz file2.dat.gz | pigz > out.gz

Here is a Python example of how to implement this Unix command with subprocess. The first call (Popen) handles the pipe. The second call (run) handles the redirection:

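import subprocess
# zcat writes the decompressed data into a pipe
p1 = subprocess.Popen(["zcat", "file1.dat.gz", "file2.dat.gz"],
                       stdout=subprocess.PIPE)
# pigz reads from that pipe; its output is redirected to out.gz
with open('out.gz', 'wb') as fout:
    p2 = subprocess.run(['pigz'], stdin=p1.stdout, stdout=fout)
p1.wait()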

Note that the first subprocess uses the Popen method while the second call uses the run method. If you use the run() method for the pipe itself,

p1 = subprocess.run(['zcat', 'file1.dat.gz', 'file2.dat.gz'],
    stdout=subprocess.PIPE)
fout = open('out.gz', 'wb')
p2 = subprocess.run(['pigz'], stdin=p1.stdout, stdout=fout)

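you would get this kind of error, because run() waits for the command to finish and returns its captured output as bytes rather than as an open pipe: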

AttributeError: 'bytes' object has no attribute 'fileno'
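
The error comes from the type of object that run() hands back. The sketch below illustrates this with small Python child processes standing in for zcat and pigz, so that it runs without those tools; the two commands are stand-ins chosen for the illustration. It also shows that the captured bytes can be passed on through the input= argument, at the cost of holding the whole stream in memory, which the Popen version avoids.

```python
import subprocess
import sys

# Stand-in for "zcat ...": a child process that writes two lines to stdout.
producer_cmd = [sys.executable, "-c", "print('b'); print('a')"]
# Stand-in for "pigz": a child process that reads stdin and writes it back sorted.
consumer_cmd = [sys.executable, "-c",
                "import sys; sys.stdout.writelines(sorted(sys.stdin))"]

# run() waits for the command to finish and returns a CompletedProcess
# whose stdout attribute holds the captured bytes, not a file object.
done = subprocess.run(producer_cmd, stdout=subprocess.PIPE)
captured_type = type(done.stdout)
print(captured_type)

# The bytes can still be handed to the next command through input=.
result = subprocess.run(consumer_cmd, input=done.stdout,
                        stdout=subprocess.PIPE)
sorted_output = result.stdout
print(sorted_output.split())
```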
Posted in Python

Solved issue with the ultimate social media icons

In my wordpress blog, I suddenly got this error on every blog post:


Notice: Undefined index: sfsi_original_counts in /var/www/wp-content/plugins/ultimate-social-media-icons/libs/controllers/sfsiocns_OnPosts.php on line 45

This was due to the ultimate-social-media-icons plugin. Fortunately, it could be solved easily by going to the “Ultimate Social Media Icons” link in the left menu. Then, on the plugin page, go to section 6 (“Any other wishes for your main icons?”) and, at the bottom of the page, switch “Error reporting/suppress error message” to yes. Do not forget to press the “save” button.

Et voilà ! Done.

Posted in wordpress

Fedora 32 installation and post installation

A new laptop! Great, let’s install a new Linux Fedora version. Interesting to see that in 4 years the version number has jumped from 23 to 32: 9 versions in 4 years. Anyway, this is surely for the best!

Installation

So, first step: get a USB key with at least 1.8 GB.

Then, download the image file. Here I simply downloaded Fedora 32 for x86_64 computers. The file is now available locally.

You will need to copy it to the USB key, making sure the key is bootable. This can be done with the mediawriter tool from Fedora. Just install it (sudo dnf install mediawriter).

Once ready, reboot the laptop. Right at the beginning, before Linux or Windows starts, press F2 to enter the BIOS. There, select the USB key as the main boot device. Save and exit.

Fedora should now start from the USB key. Following the instructions, you will need to select the hard drive on which to install Fedora. In my case, I completely got rid of Windows. Yet, at first I could not select any disk to install Fedora on. So, reboot and press F2 to enter the BIOS again. There, change the SATA Operation setting from RAID On to AHCI. Save and exit. Now it works: the disk is detected. Fedora guides you through the rest of the installation.

Post installation

First update the system … yes, even though you’ve just installed it. This may take a while:


sudo dnf update

Some packages are not available (for fedora policy reasons) and need to be added manually. Let us add the “free repository” and “non-free” repository

sudo dnf install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm
sudo dnf install https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

You may also want to edit your /etc/dnf/dnf.conf file to speed up future updates: fastestmirror makes dnf pick the fastest mirror, and deltarpm downloads only the differences with the previously installed version. Just add these two lines:

fastestmirror=true
deltarpm=true

For developers

On a daily basis, I use those tools and so I installed them straight away (you may skip this step):

sudo dnf install git vim meld gvim

Concerning git, you may want to configure it straight away by adding your username and email:


git config --global user.name "Your Name"
git config --global user.email "toto@gmail"

gnome tweak

A tool to install common software and tweak the interface is available. It is called fedy. To install it, first enable its copr repository:

sudo dnf copr enable kwizart/fedy

Then, install it with sudo dnf install fedy and type fedy to start it. Fedy is a front-end user interface to help you tweak your Fedora system. Check the apps and development tools for programs to be installed. Personally, I installed PyCharm (editor for Python and more), Slack and Skype.

I also installed Adobe Flash from the utilities tab, for Firefox and Chrome, as well as the multimedia codecs and the Microsoft TrueType core fonts.

Finally, fedy is useful to tweak GNOME.

How to minimize / maximize the windows

  • Start gnome-tweaks
  • go to the window titlebars tab
  • switch on the maximise and minimise titlebar buttons.

Other gnome tuning

  • Start gnome-tweaks.
  • Change the top bar>clock and set weekday and seconds
  • Change the top bar>battery Percentage on
  • In Extensions, switch on the “Applications menu”

Installing acroread for fedora 32:

You need to install some dependencies first:

sudo dnf install libstdc++.so.6 libpthread.so.0 libgdk_pixbuf-2.0.so.0 libGLU.so.1 libxml2.so.2 libXt.so.6 libatk-1.0.so.2 libfontconfig.so.1 libgtk-x11-2.0.so.0 libpangox-1.0.so.0 libidn.so.11 libgdk_pixbuf_xlib-2.0.so.6


wget http://ardownload.adobe.com/pub/adobe/reader/unix/9.x/9.5.5/enu/AdbeRdr9.5.5-1_i486linux_enu.rpm
sudo dnf install AdbeRdr9.5.5-1_i486linux_enu.rpm

Here you may get annoying warnings:

GTk-warning unable to locate theme engine in module_path: "adwaita", failed to load module pk-gtk-module and canberra-gtk module.


sudo dnf install libcanberra-gtk2.i686 should solve the last warning.

This is a gtk2/gtk3 issue. Both are installed, under /usr/lib64/gtk-2.0 and /usr/lib64/gtk-3.0, and both have a subdirectory called modules/. Check whether the pk-gtk and canberra modules are present there.

In the gtk3 version, there is no pk-gtk file, so let us look for the missing gtk3 package. A dnf search showed a package called PackageKit-gtk3-module-1.1.9-3.fc28.i686

sudo dnf install PackageKit-gtk3-module-1.1.9-3.fc32.i686

Note that there are two flavors (i686 and x86_64). This solved my problem. Next, the adwaita warning:

sudo dnf install gnome-themes-extra

vlc to watch videos


sudo dnf install vlc

reference: https://docs.fedoraproject.org/en-US/quick-docs/setup_rpmfusion/

Posted in Linux

how to set the logging in DEBUG mode when using the python requests package

By default, when using the excellent requests package, there is not much on the screen to figure out what goes wrong with a specific URL request. In general, you do not need it, but from time to time it is useful to switch the logging level to e.g. DEBUG mode.


import requests
r = requests.post(YOUR_URL)

prints nothing on the screen. Now, what about some details? Since there is a logging system, let us get it:


requests.logging.getLogger()

This returns nothing special except a root logger. So the logging is actually happening outside of requests.
After some googling, I figured out that this is actually happening at a lower level (urllib3).


import logging
log = logging.getLogger('urllib3')
log.setLevel(logging.DEBUG)

# logging from urllib3 to console
stream = logging.StreamHandler()
stream.setLevel(logging.DEBUG)
log.addHandler(stream)

This is better, since the URL is now printed on the screen, but it is still not very helpful.
You need to set an extra debug level in the http package itself:

from http.client import HTTPConnection
HTTPConnection.debuglevel = 1
requests.get(YOUR_URL)

Now we get much more valuable information about the headers and the content that is sent and received.
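
The two settings can be grouped into a pair of helpers so that the verbose mode is easy to switch on and off. This is only a sketch: the function names enable_http_debug and disable_http_debug are made up for the illustration, and the WARNING level used when switching off is an arbitrary choice. Note that the http.client debug output is printed directly to standard output rather than going through the logging system.

```python
import logging
from http.client import HTTPConnection


def enable_http_debug():
    """Turn on urllib3 debug logging and http.client wire-level output."""
    log = logging.getLogger("urllib3")
    log.setLevel(logging.DEBUG)
    # Avoid stacking duplicate console handlers on repeated calls.
    if not any(isinstance(h, logging.StreamHandler) for h in log.handlers):
        stream = logging.StreamHandler()
        stream.setLevel(logging.DEBUG)
        log.addHandler(stream)
    # Class attribute: affects every connection created afterwards.
    HTTPConnection.debuglevel = 1
    return log


def disable_http_debug():
    """Go back to quiet mode."""
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    HTTPConnection.debuglevel = 0
```

Call enable_http_debug() before the requests.get or requests.post call you want to inspect, and disable_http_debug() afterwards.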

Posted in Computer Science, Linux, Python