Technical notes taken during my software developments for ALMA and EVLA.

This WEB is a place where I decided to put some technical information gathered during my software developments for ALMA and EVLA. I find this WEB a nice way to retrieve information which I consider precious in my everyday work. It's also a place where I may decide to share thoughts and comments about certain technical aspects of my activity; these thoughts and comments reflect only my personal choices and visions.

-- MichelCaillat - 02 Feb 2012


Table of contents


Working on remote computers.

How to login on the 'server' machines at the Observatory of Paris from outside.

Let's take the example of an ssh session on ozone.obspm.fr while I am with my mac outside of the network of the Observatory.
Firstly, in a local shell, open an ssh tunnel via one of the gatekeeper machines, e.g. styx:
ssh -N -f -L 2222:ozone.obspm.fr:22 caillat@styx.obspm.fr

I am prompted to enter my password, which I do. After that I am left with an ssh process running in the background which performs the tunneling. The port number 2222 can actually be any port number above the reserved range (0-1023). The port 22 is the standard port number for ssh.
After that I can run as many ssh sessions as I want on ozone.obspm.fr in this way:
ssh -Y -p 2222 alma@localhost

With the tunnel created previously, an ssh to port 2222 on localhost will actually connect to ozone.obspm.fr, on the alma account in this example.

How to scp through a tunnel.

Reusing the situation described above, one creates the same tunnel with the same command and then one can use scp:
scp -P 2222 somefile account@localhost:remote-path

Please pay attention to the fact that the port is specified by -P (capital P) for scp while it's -p (lowercase) for ssh!

Another example, when I needed to copy a file APO* from aramis to my Mac at home via rubicon:
ssh -l caillat -fN -L 4567:aramis.obspm.fr:22 rubicon.obspm.fr
scp -P 4567 caillat@localhost:APO* .

Don't forget to kill the ssh process started to create the tunnel once you are done with it; it's still running in the background and keeps the port number you assigned to it busy.
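To find and stop that background tunnel, a small helper along these lines can be handy. This is only a sketch: killtunnel is a name I made up, and it relies on lsof being present (which is the case on both Mac OS X and standard Linux installs).

```shell
# Kill whatever process is listening on a given local TCP port,
# e.g. the backgrounded ssh started with -f to carry the tunnel.
# NOTE: killtunnel is a hypothetical helper, not a standard tool.
killtunnel () {
  local pid
  pid=$(lsof -t -iTCP:"$1" -sTCP:LISTEN 2>/dev/null)
  [ -n "$pid" ] && kill $pid
}
# e.g.: killtunnel 2222
```

killtunnel returns a non-zero status when nothing listens on the port, which makes it safe to call even if the tunnel is already gone.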

How to log in on the machines at the OSF from my local computer using nxclient.

In the first place, visit this link http://wikis.alma.cl/bin/view/AIV/RemoteAccessToOSFSTEs and see how to configure your nxclient application.
In my case, firstly open an ssh tunnel :
ssh -N -f -L2222:tfa-gns.aiv.alma.cl:22 mcaillat@tatio.aiv.alma.cl

giving the appropriate password (my usual one). Then log on localhost port 2222 as mcaillat with the nxclient application (the one provided by the OSF).

How to login on the machines at the OSF with ssh using a config file.

Have a config file under ~/.ssh with content as follows :
Host osf-red.aiv.alma.cl
 User mcaillat
 HostName osf-red.aiv.alma.cl
 ProxyCommand ssh mcaillat@login.alma.cl nc %h %p 2> /dev/null
Host tfint-gns.aiv.alma.cl
 User mcaillat
 HostName tfint-gns.aiv.alma.cl
 ProxyCommand ssh mcaillat@login.alma.cl nc %h %p 2> /dev/null
Host aos-gns.aiv.alma.cl
 User mcaillat
 HostName aos-gns.aiv.alma.cl
 ProxyCommand ssh mcaillat@login.alma.cl nc %h %p 2> /dev/null
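On more recent OpenSSH versions (5.4 and later), the same hop can be written without depending on nc being installed on the gateway, using ssh's -W option. A sketch of an equivalent stanza, under that assumption:

```
Host aos-gns.aiv.alma.cl
 User mcaillat
 HostName aos-gns.aiv.alma.cl
 ProxyCommand ssh -W %h:%p mcaillat@login.alma.cl
```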


Working for ALMA.

Where are all the FBT Twiki pages ?

They are here.

The kind of definitions that I like to have in my .bashrc in order to play with the different development branches for ALMA.

#
# ACS Setup.
#
alias acssetup=". $HOME/.acs/.bash_profile.acs"
alias introot="export INTROOT=\"${HOME}/introot\"; acssetup; export ACS_CDB=\"/soft/ALMA/ICD/HLA/ASDM/test\"; export PATH=/usr/local/bin:\${PATH}"
alias introotALMA800B="export INTROOT=\"${HOME}/introotALMA800B\"; acssetup; export ACS_CDB=\"/soft/ALMA/ALMA-8_0_0-B/ICD/HLA/ASDM/test\"; export PATH=/usr/local/bin:\${PATH}"
alias introotALMA803B="export INTROOT=\"${HOME}/introotALMA803B\"; acssetup; export ACS_CDB=\"/soft/ALMA/ALMA-8_0_3-B/ICD/HLA/ASDM/test\"; export PATH=/usr/local/bin:\${PATH}"
alias introotALMA810B="export INTROOT=\"${HOME}/introotALMA810B\"; acssetup; export ACS_CDB=\"/soft/ALMA/ALMA-8_1_0-B/ICD/HLA/ASDM/test\"; export PATH=/usr/local/bin:\${PATH}"

alias introotDef="echo INTROOT=\${INTROOT} ; echo ACS_CDB=\${ACS_CDB}"
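The aliases above differ only by the branch tag, so they could be generated by one bash function instead. A sketch, with the path layout copied from the aliases above: introotFor is a name I invented, and acssetup is only invoked when it is actually defined.

```shell
# Derive INTROOT and ACS_CDB from a branch tag such as "810B"
# ("810B" -> branch directory "ALMA-8_1_0-B"); no argument means trunk.
# introotFor is a hypothetical replacement for the aliases above.
introotFor () {
  local tag="$1" branch digits letter
  if [ -z "$tag" ]; then
    export INTROOT="${HOME}/introot"
    export ACS_CDB="/soft/ALMA/ICD/HLA/ASDM/test"
  else
    digits="${tag%?}"                   # e.g. "810"
    letter="${tag#"${tag%?}"}"          # e.g. "B"
    # put underscores between the digits, a dash before the letter
    branch="$(echo "$digits" | sed 's/./&_/g; s/_$//')-${letter}"
    export INTROOT="${HOME}/introotALMA${tag}"
    export ACS_CDB="/soft/ALMA/ALMA-${branch}/ICD/HLA/ASDM/test"
  fi
  if type acssetup >/dev/null 2>&1; then acssetup; fi
  export PATH="/usr/local/bin:${PATH}"
}
```

Then `introotFor 810B` reproduces what the alias introotALMA810B does, and `introotFor` alone reproduces introot.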

Which modules do I have to build ?

In that order :
  • ICD/HLA/Enumerations
  • ICD/HLA/ASDM
  • ICD/HLA/ASDMBinaries
  • OFFLINE/Utils

Where is ACS ?

It's here

with (wfhpp00fs) shhhhht !

How to create local copies of modules stored in SVN repositories ?

Firstly cd where I want to create my local copies. E.g. :

cd /soft/ALMA/

Then make local copies of the modules. Before anything else make sure that the target directories exist.

Enumerations:
svn co https://alma-svn.hq.eso.org/p2/trunk/ICD/HLA/Enumerations ./ICD/HLA/Enumerations

ASDM:
svn co https://alma-svn.hq.eso.org/p2/trunk/ICD/HLA/ASDM ./ICD/HLA/ASDM

ASDMBinaries:
svn co https://alma-svn.hq.eso.org/p2/trunk/ICD/HLA/ASDMBinaries ./ICD/HLA/ASDMBinaries

OFFLINE/Utils:
svn co https://alma-svn.hq.eso.org/p2/trunk/OFFLINE/Utils ./OFFLINE/Utils

Where are the (CVS) local copies of these modules ?

For a given CVS development branch, all the modules listed above have local copies rooted in the same directory, say <..> :
  • <..>/ICD/HLA/Enumerations
  • <..>/ICD/HLA/ASDM
  • <..>/ICD/HLA/ASDMBinaries
  • <..>/OFFLINE/Utils
How do I build ICD/HLA/Enumerations ?

cd <..>

How do I find which files must be compared during a merge operation for ICD/HLA/ASDM (e.g. TRUNK, SomeBranch -> TRUNK) ?

I use a diff command with a lot of -x options to eliminate all the files irrelevant to the comparison (the wildcard patterns are quoted so that the shell does not expand them before diff sees them):

For ICD/HLA/ASDM :
diff -q -r -x CVS -x object -x doc -x .cvsignore -x '*.so' -x swig -x '*.xsd' -x bin -x lib -x test -x '*_SA.cpp' -x '*Table.*' -x '*Row.*' -x 'semantic*' -x '*.cpp.~*' -x '*.h.~*' -x 'Merger.*' -x 'Parser.*' -x enumerations -x '*.cpp~' -x .settings -x .project -x '.py*' -x '*#' -x '*.h~' -x '*.~' -x ASDM.java -x '*.dump' -x AsdmObject.py -x rtai /soft/ALMA/ICD/HLA/ASDM/ /soft/ALMA/ALMA-9_0-B/ICD/HLA/ASDM/

For ICD/HLA/Enumerations :
 diff -q -r -x CVS -x object -x doc -x .cvsignore -x '*.so' -x swig -x '*.xsd' -x bin -x lib -x test -x 'C*.cpp' -x 'C*.h' -x 'semantic*' -x '*.cpp.~*' -x '*.h.~*' -x '*.cpp~' -x .settings -x .project -x '.py*' -x '*#' -x '*.h~' -x '*.~' -x ASDM.java -x '*.dump' -x AsdmObject.py /soft/ALMA/ICD/HLA/Enumerations/ /soft/ALMA/ALMA-9_0-B/ICD/HLA/Enumerations/

For ICD/HLA/ASDMBinaries :
 diff -q -r -x CVS -x object -x doc -x .cvsignore -x '*.so' -x '*.xsd' -x bin -x lib -x test -x 'semantic*' -x '*.cpp.~*' -x '*.h.~*' -x '*.cpp~' -x .settings -x .project -x '*#' -x '.#*' -x '*.h~' -x '*.~' /soft/ALMA/ICD/HLA/ASDMBinaries/ /soft/ALMA/ALMA-9_0-B/ICD/HLA/ASDMBinaries/

This was used for the merge TRUNK, ALMA-9_0-B -> ALMA-9_0-B. It can be improved but it's a good starting point.

How do I list the names (without path or extension) of a collection of files in bash ?

for f in `ls /soft/ALMA/ICD/HLA/ASDM/include/*Table.h`; do filename=$(basename $f);  filename=${filename%.*}; echo $filename; done

The example above gives the list of all the filenames corresponding to *Table.h located in /soft/ALMA/ICD/HLA/ASDM/include.
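The same listing can be done without spawning ls and basename at all, using only bash parameter expansion. A small sketch (names is a made-up helper name):

```shell
# Print the bare name (no path, no extension) of each argument.
names () {
  local f
  for f in "$@"; do
    f=${f##*/}               # strip the directory part
    printf '%s\n' "${f%.*}"  # strip the extension
  done
}
# e.g.: names /soft/ALMA/ICD/HLA/ASDM/include/*Table.h
```

Letting the glob expand directly in the argument list also behaves correctly with filenames containing spaces, which the backquoted `ls` does not.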


How do I work at the border between ALMA and CASA ?

Files generated in ICD/HLA/ASDM but used by the filler (CASA).

The filler has an option 'asis' which produces verbatim copies of the ASDM tables as MS tables. This option uses C++ code generated whenever code generation is activated in ICD/HLA/ASDM. The files containing this code are ASDMTables.cpp, Name2Table.cpp and ASDMTables.h, and their locations on each side (ALMA and CASA) are :
ALMA/ACS -> CASA :
  • <..>/ICD/HLA/ASDM/object/antbuild/tmp/src/Name2Table.cpp -> /opt/casa/trunk/code/alma/apps/asdm2MS/Name2Table.cc
  • <..>/ICD/HLA/ASDM/object/antbuild/tmp/src/ASDMTables.cpp -> /opt/casa/trunk/code/alma/apps/asdm2MS/ASDMTables.cc
  • <..>/ICD/HLA/ASDM/object/antbuild/tmp/include/ASDMTables.h -> /opt/casa/trunk/code/alma/apps/asdm2MS/ASDMTables.h

Don't forget to update the files on the CASA side whenever a change occurs on the ALMA side.

ASDM.

How to compare *Table.h between ICD/HLA/ASDM/include and /opt/casa/trunk/code/alma/implement/ASDM ?

LEFTDIR=/soft/ALMA/ICD/HLA/ASDM/include; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDM; for f in `ls $LEFTDIR/*Table.h`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.h $RIGHTDIR/$filename.h; done

How to compare *Table.cpp in ICD/HLA/ASDM/src and *Table.cc in /opt/casa/trunk/code/alma/implement/ASDM ?

 LEFTDIR=/soft/ALMA/ICD/HLA/ASDM/src; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDM; for f in `ls $LEFTDIR/*Table.cpp`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.cpp $RIGHTDIR/$filename.cc; done

How to compare *Row.h between ICD/HLA/ASDM/include and /opt/casa/trunk/code/alma/implement/ASDM ?

LEFTDIR=/soft/ALMA/ICD/HLA/ASDM/include; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDM; for f in `ls $LEFTDIR/*Row.h`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.h $RIGHTDIR/$filename.h; done

How to compare *Row.cpp in ICD/HLA/ASDM/src and *Row.cc in /opt/casa/trunk/code/alma/implement/ASDM ?

LEFTDIR=/soft/ALMA/ICD/HLA/ASDM/src; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDM; for f in `ls $LEFTDIR/*Row.cpp`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.cpp $RIGHTDIR/$filename.cc; done

How to compare *.cpp in ICD/HLA/ASDM/src and *.cc in /opt/casa/trunk/code/alma/implement/ASDM (ignoring *_SA.cpp) ?

shopt -s extglob
LEFTDIR=/soft/ALMA/ICD/HLA/ASDM/src; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDM; for f in `ls $LEFTDIR/!(*_SA).cpp`; do filename=$(basename $f);  filename=${filename%.*}; diff -q $LEFTDIR/$filename.cpp $RIGHTDIR/$filename.cc; done

ASDMBinaries

How to compare ICD/HLA/ASDMBinaries/src/*.cpp and *.cc in /opt/casa/trunk/code/alma/implement/ASDMBinaries ?

 LEFTDIR=/soft/ALMA/ICD/HLA/ASDMBinaries/src; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/ASDMBinaries; for f in `ls $LEFTDIR/*.cpp`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.cpp $RIGHTDIR/$filename.cc; done

How to compare ICD/HLA/ASDMBinaries/include/*.h and *.h in /opt/casa/trunk/code/alma/implement/ASDMBinaries ?

LEFTDIR=/soft/ALMA/ICD/HLA/ASDMBinaries/include; RIGHTDIR=/mnt/hgfs/casa/trunk/code/alma/implement/ASDMBinaries; for f in `ls $LEFTDIR/*.h`; do filename=$(basename $f);  filename=${filename%.*}; diff $LEFTDIR/$filename.h $RIGHTDIR/$filename.h; done
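All the comparison loops above follow the same pattern, so they can be folded into one function. A sketch: cmpdirs is a name I invented; pass the pattern quoted so the function, not the calling shell, expands it.

```shell
# Compare every file matching PATTERN in LEFTDIR with the file of the
# same stem in RIGHTDIR, with an optional right-hand extension (so that
# ALMA's .cpp can be diffed against CASA's .cc).
# Usage: cmpdirs LEFTDIR RIGHTDIR PATTERN [RIGHT_EXT]
cmpdirs () {
  local leftdir="$1" rightdir="$2" pattern="$3" rext="$4" f stem
  for f in "$leftdir"/$pattern; do
    [ -e "$f" ] || continue            # pattern matched nothing
    stem=$(basename "$f"); stem=${stem%.*}
    diff -q "$f" "$rightdir/$stem${rext:-.${f##*.}}"
  done
}
# e.g.: cmpdirs /soft/ALMA/ICD/HLA/ASDM/src /mnt/hgfs/casa/trunk/code/alma/ASDM '*Table.cpp' .cc
```

When RIGHT_EXT is omitted, the left file's own extension is reused, which covers the *.h-to-*.h comparisons as well.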

Working on CASA.

Where do I find the CASA distros ?

At https://svn.cv.nrao.edu/casa/linux_distro/

and also

https://svn.cv.nrao.edu/svn/casa/trunk

https://svn.cv.nrao.edu/svn/casa/branches/

How do I build and install casacore, code, ..., etc on my Mac ?

Everything you need to know is very likely on Scott Rankin's page.

https://safe.nrao.edu/wiki/bin/view/Main/ScottRankinCasaBuildNotes

But, but, but

On Lion (MacOSX 10.7) do not forget to execute /opt/casa/darwin11/casainit.sh prior to anything else!

How do I install the third party packages required by CASA on Linux EL 5.7 ?

I use yum.
  • Firstly check that the yum repo description for CASA is as follows :
[caillat@localhost build]$ more /etc/yum.repos.d/casa.repo 
[casa]
name=CASA RPMs for RedHat Enterprise Linux 5 (x86_64)
baseurl=http://svn.cv.nrao.edu/casa/repo/el5/x86_64/
gpgkey=http://svn.cv.nrao.edu/casa/RPM-GPG-KEY-casa http://www.jpackage.org/jpackage.asc http://svn.cv.nrao.edu/casa/repo/el5/RPM-GPG-KEY-redhat-release http://svn.cv.nrao.edu/casa/repo/el5/RPM-GPG-KEY-EPEL
enabled = 1
  • Secondly launch the following command with root privileges.
sudo yum install --nogpgcheck antlr-c++-devel antlr-c++-shared casapy-boost141 casapy-boost141-devel casapy-python casapy-python-devel cfitsio-devel dbus-c++ fftw3 fftw3-devel qt471-devel qt471-qwt-devel rpfits tix tix-devel wcslib xerces-c xerces-c-devel aatm aatm-devel dbus-c++-devel blas-devel cmake almawvr casapy-swig lapack-devel pgplot pgplot-devel wcslib wcslib-devel gsl gsl-devel

-- MichelCaillat - 06 Jun 2013

How do I install the third party packages required by CASA on Linux EL 6.7 ?

I use yum.
  • Firstly check that the yum repo description for CASA is as follows :
[caillat@localhost ~]$ cat /etc/yum.repos.d/casa.repo 
[casa]
name=CASA RPMs for RedHat Enterprise Linux 6 (x86_64)
baseurl=http://svn.cv.nrao.edu/casa/repo/el6/x86_64
gpgcheck=0
gpgkey=http://svn.cv.nrao.edu/casa/RPM-GPG-KEY-casa http://www.jpackage.org/jpackage.asc http://svn.cv.nrao.edu/casa/repo/el6/RPM-GPG-KEY-redhat-release http://svn.cv.nrao.edu/casa/repo/el6/RPM-GPG-KEY-EPEL
  • Secondly launch the following command with root privileges.
yum install gcc-c++ tar subversion curl aatm-devel blas-devel casa01-dbus-cpp-devel casa01-dvi2tty casa01-python-devel casa01-qwt-devel casa01-swig cmake fftw-devel flex bison gsl-devel lapack-devel libxml2-devel wcslib-devel xerces-c28-devel cfitsio-devel boost-devel plplot-c-devel pgplot-devel java-1.6.0-openjdk libxslt-devel rpfits readline-devel sqlite-devel pgplot-devel libsakura devtoolset-3-memstomp-0.1.5-3.el6.x86_64 devtoolset-3-elfutils-libs-0.161-1.el6.x86_64 devtoolset-3-toolchain-3.0-15.el6.noarch devtoolset-3-strace-4.8-8.el6.x86_64 devtoolset-3-ltrace-0.7.91-10.el6.x86_64 devtoolset-3-dwz-0.11-1.1.el6.x86_64 devtoolset-3-runtime-3.0-15.el6.noarch devtoolset-3-libstdc++-devel-4.9.2-6.el6.x86_64 devtoolset-3-elfutils-libelf-0.161-1.el6.x86_64 devtoolset-3-gdb-7.8.2-38.el6.x86_64 devtoolset-3-elfutils-0.161-1.el6.x86_64 devtoolset-3-gcc-4.9.2-6.el6.x86_64 devtoolset-3-gcc-gfortran-4.9.2-6.el6.x86_64 devtoolset-3-libquadmath-devel-4.9.2-6.el6.x86_64 devtoolset-3-binutils-2.24-18.el6.x86_64 devtoolset-3-gcc-c++-4.9.2-6.el6.x86_64 casa01-mpi4py-1.3.1-1.el6.x86_64 casa01-openmpi-1.6.5-7.el6.x86_64

How do I update the data repository ?

This has to be done typically when the filler outputs messages warning about a possible error of precision. This means that the ephemerides data have to be updated.
cd <..>/data
rsync -avz rsync.aoc.nrao.edu::casadata .

How do I run the CASA unit tests ?

Sandra Castro has written a useful page about this.

Where do I find all sorts of information for CASA developers ?

https://safe.nrao.edu/wiki/bin/view/Software/CasaIndex

How do I extract an ASDM dataset from the EVLA archive ?

Assuming that I know the name of the dataset (example : 10B-137_sb1924711_1.55477.98106888889) :
  • I open a WEB browser at https://archive.nrao.edu/archive/advquery.jsp
  • I fill the field labeled "Archive File ID" with the name of the dataset (in the middle of the section "General Search Parameters ") and click on button "Submit Query"
  • Then, if the dataset is found in the Archive, I am taken to a new page where I have to provide some other personal information and a few more selection criteria before the data are actually extracted and temporarily stored in an area which I can read with anonymous FTP. A mail is sent to me when the data are ready to be grabbed via FTP.
Note that the interface allows me to request the full ASDM dataset ( tables + visibilities ) or only the tables.

How did I create a new CASA module (alma_v3) ?

Such a situation appeared when I had to maintain two concurrent versions of the filler, one per version of the ASDM. Beside the current one, named alma, I wanted to create a new one named alma_v3 containing all the software related to ASDM version 3. The sequence of operations, which I had to figure out by myself, was the following :
  • Create a hierarchy of directories under /opt/casa/trunk/code/alma_v3 :
/opt/casa/trunk/code/alma_v3
   |---prop-base
   |---props
   |---text-base
   |---tmp
   |-----prop-base
   |-----props
   |-----text-base
   |-apps
   |---asdm2MS_v3
   |-implement
   |---ASDM
   |---ASDMBinaries
   |---Enumerations
   |---Enumtcl

which was inspired by what is under /opt/casa/trunk/code/alma.
  • Populate ASDM, ASDMBinaries, Enumerations, Enumtcl, apps/asdm2MS_v3 with the source codes (see How I port software changes from the ALMA/ACS world to the CASA's)
  • It is very important to notice that the source file of the filler has to be named exactly like the directory which contains it. In our case apps/asdm2MS_v3 must contain asdm2MS_v3.cc
  • In /opt/casa/trunk/code/CMakeLists.txt add a line to add the module alma_v3. At the time of writing, the section (lines 977-1007) where modules are added looks like :
# The modules must be defined in dependency order!
# This will set up include paths, and which libraries to link to

# for linux RPM & binary distros libgraphics contains pgplot (linked in)...
if( APPLE )
  casa_add_module( graphics CASACORE PGPLOT WCSLIB )
else()
  if( NOT SKIP_PGPLOT )
    casa_add_module( graphics CASACORE PGPLOT X11 WCSLIB )
  else()
    casa_add_module( graphics CASACORE X11 WCSLIB )
  endif()
endif()
casa_add_module( casadbus CASACORE DBUS )
casa_add_module( tableplot CASACORE PYTHON )
casa_add_module( msvis CASACORE graphics Boost )
casa_add_module( casaqt CASACORE QT4 QWT PYTHON XERCES graphics )
casa_add_module( plotms CASACORE QT4 casaqt msvis Boost )
casa_add_module( display CASACORE WCSLIB QT4 casaqt msvis Boost )
#casa_add_module( display3d CASACORE QT4 OPENGL display )
casa_add_module( flagging CASACORE Boost tableplot msvis )
casa_add_module( calibration CASACORE tableplot msvis Boost )
casa_add_module( synthesis CASACORE ATM casadbus msvis Boost calibration )
casa_add_module( alma CASACORE LIBXML2 Boost )
casa_add_module( alma_v3 CASACORE LIBXML2 LIBXSLT Boost synthesis )
casa_add_module( oldalma CASACORE LIBXML2 Boost )
casa_add_module( dish CASACORE )
casa_add_module( nrao CASACORE )
casa_add_module( spectrallines CASACORE )
casa_add_module( xmlcasa CASACORE CCMTOOLS PYTHON ATM READLINE DL plotms display flagging synthesis dish nrao spectrallines )

  • In /opt/casa/trunk/code/include create a symbolic link ./alma_v3 pointing to /opt/casa/trunk/code/alma_v3 :
volte:include michel$ pwd
/opt/casa/trunk/code/include
volte:include michel$ ls -l
total 168
lrwxr-xr-x  1 michel  admin  17  6 mai 17:27 alma -> ../alma/implement
lrwxr-xr-x  1 michel  admin  21 10 mai 11:14 alma_v3 -> ../alma_v3/implement/
lrwxr-xr-x  1 michel  admin  17  6 mai 17:27 atnf -> ../atnf/implement
lrwxr-xr-x  1 michel  admin  24  6 mai 17:27 calibration -> ../calibration/implement
lrwxr-xr-x  1 michel  admin  21  6 mai 17:27 casadbus -> ../casadbus/implement
lrwxr-xr-x  1 michel  admin  19  6 mai 17:27 casaqt -> ../casaqt/implement
lrwxr-xr-x  1 michel  admin  17  6 mai 17:27 demo -> ../demo/implement
lrwxr-xr-x  1 michel  admin  17  6 mai 17:27 dish -> ../dish/implement
lrwxr-xr-x  1 michel  admin  20  6 mai 17:27 display -> ../display/implement
lrwxr-xr-x  1 michel  admin  22  6 mai 17:27 display3d -> ../display3d/implement
lrwxr-xr-x  1 michel  admin  21  6 mai 17:27 flagging -> ../flagging/implement
lrwxr-xr-x  1 michel  admin  21  6 mai 17:27 graphics -> ../graphics/implement
lrwxr-xr-x  1 michel  admin  18  6 mai 17:27 msvis -> ../msvis/implement
lrwxr-xr-x  1 michel  admin  17  6 mai 17:27 nrao -> ../nrao/implement
lrwxr-xr-x  1 michel  admin  20  6 mai 17:27 oldalma -> ../oldalma/implement
lrwxr-xr-x  1 michel  admin  19  6 mai 17:27 plotms -> ../plotms/implement
lrwxr-xr-x  1 michel  admin  23  6 mai 17:27 singledish -> ../singledish/implement
lrwxr-xr-x  1 michel  admin  26  6 mai 17:27 spectrallines -> ../spectrallines/implement
lrwxr-xr-x  1 michel  admin  22  6 mai 17:27 synthesis -> ../synthesis/implement
lrwxr-xr-x  1 michel  admin  22  6 mai 17:27 tableplot -> ../tableplot/implement

  • Attention: this part was the most difficult to figure out since it's specific to alma (whatever its version) and some other modules. The file to be edited is /opt/casa/trunk/code/install/target.cmake. It has a section (lines 236-249) where the include directories for these modules are defined specifically :
# Special case for (old)alma(_v3)
    if( ${module} STREQUAL "alma" OR
        ${module} STREQUAL "oldalma" OR
        ${module} STREQUAL "alma_v3" )

      set( ${module}_INCLUDE_DIRS
        ${${module}_INCLUDE_DIRS}
        ${CMAKE_SOURCE_DIR}/include/${module}/ASDM
        ${CMAKE_SOURCE_DIR}/include/${module}/ASDMBinaries
        ${CMAKE_SOURCE_DIR}/include/${module}/Enumtcl
        ${CMAKE_SOURCE_DIR}/include/${module}/Enumerations
        )

    endif()

Once one has gone through all these steps, there are very serious chances to get the library libalma_v3 and the application asdm2MS_v3 built by the standard procedure using cmake and make.


Editing.

How could I type braces in Emacs (23.2) on my Mac ?

I added this to my .emacs :
(setq mac-option-modifier nil
     mac-command-modifier 'meta
     x-select-enable-clipboard t)

How could I get a satisfying keyboard behaviour in a VMWare Fusion virtual machine running linux ( SL 5.x ) ?

1) In the Preferences of VMWare Fusion, the item "Clavier et souris" (Keyboard & Mouse) looks like this :

KM.2.gif

KM.3.gif

2) In Scientific Linux 5.x, open a terminal and then in 'Système->Préférences->Clavier' (System->Preferences->Keyboard) ensure that the 'Agencements' (Layouts) tab shows this :

Clavier-Proprie769te769sDuClavier-Agencements-SL.gif

Programming.

In C++

Where did I find interesting documentation about programming with libxml and libxslt ?

Thank you John Fleck.

Debugging.

How to obtain core dumps on Mac OS X ?

ulimit -c unlimited

and the core dump files are written in /cores as core.<pid>
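A quick way to check that core dumps are actually enabled is to crash a throwaway program on purpose. A sketch; note that on Linux the core typically appears in the current directory (or wherever /proc/sys/kernel/core_pattern points) rather than in /cores.

```shell
# Allow cores of unlimited size in this shell, then force a segfault.
ulimit -c unlimited
cat > crashme.c <<'EOF'
int main(void) { return *(volatile int *)0; }  /* deliberate crash */
EOF
cc -g -o crashme crashme.c
./crashme || echo "crashed with status $?"
ls -l /cores/core.* 2>/dev/null || true   # Mac OS X location
```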

Testing and experimenting.

Playing with boost regex and regex_match in two parallel threads.

#include <boost/regex.hpp>
using namespace boost;

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void * testRegex(void * ptr);

int main(int argC, char* argV[]) {
  if (argC < 2) { cout << "Usage: " << argV[0] << " uid" << endl; return 1; }
  pthread_t thread1, thread2;
  int iret1 = pthread_create(&thread1, NULL, testRegex, (void*) argV[1]);
  int iret2 = pthread_create(&thread2, NULL, testRegex, (void*) argV[1]);
  cout << "Waiting ..." << endl;
  pthread_join(thread1, NULL);
  pthread_join(thread2, NULL);
  sleep(5);
  return 0;
}

void* testRegex(void * ptr) {
  regex expression("^[uU][iI][dD]://[0-9a-zA-Z]+(/[xX][0-9a-fA-F]+){2}(#\\w{1,}){0,}$");
  const char * theUID_p = (const char *) ptr;
  cout << "Checking " << theUID_p << endl;
  for (unsigned int i = 0; i < 20000; i++) {
    cmatch what;
    if (!regex_match(theUID_p, what, expression)) {
      cout << "Probleme" << endl;
      break;
    }
  }
  return 0;
}

/*
Compiled and linked as follows on my Linux SL 55 (32 bits) :
c++ -g -o TestRegexThreadSafe TestRegexThreadSafe.cc -lboost_regex -lpthread

Compiled and linked as follows on Mac OS X Snow Leopard :
c++ -g -o TestRegexThreadSafe -I/opt/casa/darwin10-64b/include/ TestRegexThreadSafe.cc -L/Applications/CASA.app/Contents/Frameworks/  -lboost_regex -lpthread
 
Example of run : 
./TestRegexThreadSafe uiD://X123456/XabCdE/X0fE

*/

Opening two ASDM in two parallel threads.

#include <ASDM.h>
#include <ScanTable.h>
#include <CalAtmosphereTable.h>
#include <iostream>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include "boost/filesystem.hpp"
using namespace asdm;

void *asdmProcess(void *ptr);

int main(int argc, char *argv[]) {
  if (argc < 3) {
    std::cout << "Usage: " << boost::filesystem::basename(argv[0]) << " asdm1 asdm2" << std::endl;
    exit(0);
  }
  pthread_t thread1, thread2;
  /* Create independent threads each of which will execute asdmProcess. */
  int iret1 = pthread_create(&thread1, NULL, asdmProcess, (void*) argv[1]);
  int iret2 = pthread_create(&thread2, NULL, asdmProcess, (void*) argv[2]);
  std::cout << "waiting..." << std::endl;
  pthread_join(thread1, NULL);
  pthread_join(thread2, NULL);
  sleep(10);
  return 0;
}

void *asdmProcess(void *ptr) {
  string asdmDirectory = string((char *) ptr);
  std::cout << "asdm=" << asdmDirectory << std::endl;
  ASDM *asdm = new ASDM();
  asdm->setFromFile(asdmDirectory, false);  // loadTablesOnDemand=false
  ScanTable &scan = asdm->getScan();
  std::cout << asdmDirectory << ": scan.size()=" << scan.size() << std::endl;
  CalAtmosphereTable &calAtmosphere = asdm->getCalAtmosphere();
  std::cout << asdmDirectory << ": calAtmosphere.size()=" << calAtmosphere.size() << std::endl;
  return 0;
}
/*
Compiled and linked as follows on my iMac Snow Leopard:
c++ -DWITHOUT_ACS -g -o TestThreadedASDM TestThreadedASDM.cc -I/opt/casa/darwin10-64b/include/ -I/opt/casa/trunk/code/alma/implement/ASDM -I/opt/casa/trunk/code/alma/implement/Enumerations -L/Applications/CASA.app/Contents/Frameworks/ -lboost_filesystem -lboost_system -L/opt/casa/trunk/darwin64/lib/ -lalma

Compiled and linked as follows on my MacBook Pro, Lion:
c++ -DWITHOUT_ACS -g -o TestThreadedASDM TestThreadedASDM.cc -I/opt/casa/darwin11/include/ -I/opt/casa/trunk/code/alma/implement/ASDM -I/opt/casa/trunk/code/alma/implement/Enumerations -L/Applications/CASA.app/Contents/Frameworks/ -lboost_filesystem -lboost_system -L/opt/casa/trunk/darwin11/lib/ -lalma

Compiled and linked as follows on my Linux SL 55 (32 bits) with the classical
ACS environment variables defined and casa supposed to be installed under ALMA_INSTDIR.

c++ -DWITHOUT_ACS -g -o TestThreadedASDM TestThreadedASDM.cc -I$ALMASW_INSTDIR/casa/include -I$INTROOT/include -L$ALMASW_INSTDIR/casa/lib -lboost_filesystem -lboost_system -lboost_regex -lxml2 -lxslt -L$INTROOT/lib/ -lasdmStandalone -lalmaEnumerations -lpthread

*/

Lazy filler. Principle and performances.

(last update -- MichelCaillat - 03 Dec 2012)

Before talking about the "lazy filler", let's recall that the filler is the application which transforms an ASDM dataset produced e.g. by ALMA or the EVLA into a CASA Measurement Set. It does not add any information; it simply reorganizes data formatted à la ASDM into a format readable by CASA applications. In that regard, a side effect of using the filler is a duplication of the disk space used to contain basically the same information. While this duplication can be viewed as acceptable for the parts (tables) of the dataset which are not too big, it can be perceived as expensive in terms of storage and time spent in I/Os for the observational data (visibilities). Then comes the idea of using one powerful mechanism of the CASA tables software: the storage manager. The storage manager is the piece of software which decouples the way data are actually organized in their storage from the way they are logically seen by the "high level" software and by the end user. Each collection of data (basically columns of tables) has one storage manager attached to it; every time a read or write access is done on a column, its storage manager is invoked to perform the actual read or write operation on the underlying data.

The approach used so far by the ASDM to MS filler was to transform all the data, including the visibilities, from their original format into CASA tables by using storage managers already existing in CASA. This method can be qualified as "non lazy" in the sense that it transforms all the data contained in the ASDM dataset into the data forming the measurement set.

Then comes the idea of using an ad hoc storage manager allowing to view, on demand or just in time, the observational data of an ASDM dataset left in their original format (BDF) on the storage device but presented in the DATA column of a CASA measurement set. At run time this storage manager needs a set of information (where are the data? how are they stored? how are they encoded?) to retrieve on demand the data stored in the original format (ASDM/BDF) and to logically present them in the DATA column of the CASA measurement set; it is this information that the lazy filler produces and stores in the measurement set directory, in one file which is by far smaller and faster to produce than the result of a total conversion.

Tests.

In order to test an implementation of this idea, realized by Ger van Diepen, Dirk Petry and myself, I considered a relatively large dataset produced by the ALMA radiotelescope, occupying approximately 8Gb (303 BDF files) on disk. All the tests were performed on a MacOS X Lion 10.7.5 platform with 8Gb of RAM and a SATA disk (TOSHIBA MK7559GSXF, 5400rpm).

About TaQL

Let's recall here that documentation for the very powerful application TaQL, developed by Ger van Diepen, can be found here.

MacBook-Pro-de-Michel:SV-Mars michel$ du -sk ./uid___A002_X48b450_Xc2
8156872 ./uid___A002_X48b450_Xc2
MacBook-Pro-de-Michel:SV-Mars michel$
Comparing the times consumed by the executions of the non lazy and lazy fillers.

non lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ time asdm2MS uid___A002_X48b450_Xc2
No transformation will be applied on this dataset.
real 11m5.406s
user 4m25.716s
sys 0m34.809s
MacBook-Pro-de-Michel:SV-Mars michel$

lazy:
MacBook-Pro-de-Michel:SV-Mars michel$ time asdm2MS --lazy uid___A002_X48b450_Xc2 uid___A002_X48b450_Xc2.lazy.ms
No transformation will be applied on this dataset.
real 4m31.107s
user 1m33.145s
sys 0m11.729s
MacBook-Pro-de-Michel:SV-Mars michel$
Verifying the results.

The first thing to do before making any other comparisons is to verify that the measurement set produced by the lazy filler is similar to the one produced by the non lazy one.
MacBook-Pro-de-Michel:SV-Mars michel$ taql 'select from [select from uid___A002_X48b450_Xc2.ms/ orderby TIME, DATA_DESC_ID, ANTENNA1, ANTENNA2 ] t1, [select from uid___A002_X48b450_Xc2.lazy.ms/ orderby TIME, DATA_DESC_ID, ANTENNA1, ANTENNA2 ] t2 where (not all(near(t1.DATA,t2.DATA, 1.e-06))) '
select result of 0 rows
MacBook-Pro-de-Michel:SV-Mars michel$

The result of this query confirms the (quasi) equality between the DATA columns of the two datasets; a relative tolerance of 1.e-06 was allowed in order to take into account possible different rounding errors. One can notice that we took care of imposing the same order on the two tables before proceeding to the comparison; this is necessary in particular because the order of baselines in the MS produced by the lazy filler (directly inherited from their order in the BDFs) is different from what it is in the MS produced by the non lazy filler.
Comparison of the disk space consumption.

non lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ du -sk uid___A002_X48b450_Xc2.ms
16621500 uid___A002_X48b450_Xc2.ms
MacBook-Pro-de-Michel:SV-Mars michel$

One notices that :
  • the MS occupies roughly twice the space used by the ASDM
  • the total space occupied by the ASDM and the MS is equal to the sum of their respective sizes, i.e. approx 24 Gb
lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ du -sk uid___A002_X48b450_Xc2.lazy.ms/
527532 uid___A002_X48b450_Xc2.lazy.ms/
MacBook-Pro-de-Michel:SV-Mars michel$ du -sk uid___A002_X48b450_Xc2/ASDMBinary/
8011184 uid___A002_X48b450_Xc2/ASDMBinary/
MacBook-Pro-de-Michel:SV-Mars michel$ 

In that case one must add the space occupied by the measurement set (i.e. the total size of all the files located in the MS directory and its subdirectories) to the total size of the ASDM dataset's files containing the visibilities, since the ASDM and the MS share these files. One easily notices that :
  • the disk space used by the lazily filled MS is very small compared with the one used by the MS filled with the non lazy method.
  • this disk space is negligible compared with the total size of the files containing the visibilities which are, once more, shared with the ASDM dataset.
  • consequently the immediate benefit of using the lazy filler is a drastic saving of disk space ( ~8Gb vs. ~24Gb).
Comparisons of read performances.

So far the comparisons plead undoubtedly in favour of using the lazy filler. The last comparison which can confirm the interest of the lazy filler, or on the contrary invalidate it, is the performance observed while reading the DATA columns of both measurement sets. What's the price of using the asdm storage manager (this is the name of the storage manager developed for the lazy filler) on the visibilities stored in their original format, versus using a standard CASA storage manager on visibilities already reformatted for that storage manager ?

The comparisons are done with the taql tool, which allows to browse a measurement set very simply and to make time measurements (the results are expressed in seconds). For documentation about TaQL you may want to read this note.
Retrieving shapes.

non lazy :
TaQL> time select shape(DATA) from uid___A002_X48b450_Xc2.ms/
Projection 20.07 real 19.3 user 0.57 system
Total time 20.1 real 19.31 user 0.57 system
using style python time select shape(DATA) from uid___A002_X48b450_Xc2.ms/
has been executed
select result of 2170102 rows

lazy :
TaQL> time select shape(DATA) from uid___A002_X48b450_Xc2.lazy.ms/
 Projection 20.39 real 19.23 user 0.57 system
 Total time 20.4 real 19.25 user 0.58 system
using style python time select shape(DATA) from uid___A002_X48b450_Xc2.lazy.ms/
 has been executed
 select result of 2170102 rows

Clearly, as long as one is interested only in the shapes (admittedly not very useful), the performances are very similar.
Working with the values. Computing the sums of values in each DATA cell.

non lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ taql 'time select SUM(DATA) from uid___A002_X48b450_Xc2.ms'
Projection 494.16 real 177.65 user 12.77 system
Total time 494.35 real 177.66 user 12.8 system
select result of 2170102 rows

lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ taql 'time select SUM(DATA) from uid___A002_X48b450_Xc2.lazy.ms'
Projection 348.35 real 159.33 user 11.37 system
Total time 352.43 real 159.51 user 11.46 system
select result of 2170102 rows

Surprisingly, the results are indisputably in favour of the lazy approach, which appears 1.4 times faster than the non-lazy one !
Working with values. Computing the phases.

non lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ taql 'time select PHASE(DATA) from uid___A002_X48b450_Xc2.ms/'
Projection 680.29 real 302.15 user 27.01 system
Total time 680.52 real 302.16 user 27.03 system
select result of 2170102 rows
MacBook-Pro-de-Michel:SV-Mars michel$

lazy :
MacBook-Pro-de-Michel:SV-Mars michel$ taql 'time select PHASE(DATA) from uid___A002_X48b450_Xc2.lazy.ms/'
Projection 534.15 real 284.8 user 26.18 system
Total time 538.93 real 285 user 26.27 system
select result of 2170102 rows

Again the performances obtained with the measurement set filled "lazily" are better than with the other measurement set, by a factor of approximately 1.26.

Parallel filler - Principle and performances.

(-- MichelCaillat - 25 Sep 2013)

Before talking about the parallel filler, let's recall that in our context of work a filler is an application which reads an ASDM dataset produced e.g. by ALMA or the EVLA and transforms it into a CASA Measurement Set. It does not add any information; it simply reorganizes data formatted à la ASDM into a format readable by CASA applications. It is also useful to remember that in an ASDM dataset the data are organized in two different ways. Firstly, the data which describe all the aspects of the observation, sometimes called metadata, are logically organized in a set of tables like the tables of a plain SQL database and stored in a collection of XML documents, whether on disk or in the Archive. Secondly, the data produced by all the processors of the telescope (Correlator, Radiometer, Square Law Detector) are stored on disk or archived in a set of binary files. The format of these files is known as the Bulk Data Format (BDF). Of course there is an association between the set of tables and the set of binary files so that the ASDM is a coherent whole. More precisely, among the different tables there is one, the Main table, whose rows contain references to the BDFs of the dataset.

What do we want to parallelize ?

During one execution of the filler, most of the (elapsed) time is spent reading the BDFs, reformatting their content and writing the result in the appropriate place in the MS, simply because the number of bytes occupied by the BDFs is greater than the global size in bytes of the metadata. As an example let's consider uid://A002/X48b450/Xc2, which contains roughly 50 minutes of observation with data produced by 22 antennas and their radiometers. Data are sampled at full (spectral) resolution and also by channel averaging. The number of channels can be up to 3840. The number of polarizations is 2. In these conditions the BDFs occupy 8011184 Kb while the tabulated data (the metadata) occupy 145688 Kb. Logically, the effort of parallelization will be put on the processing of the BDFs, given that this part is roughly 55 times larger than the part containing the metadata and that the work on each part has roughly the same complexity.

What will the parallelization result in ?

The existing filler, which works in a purely sequential way, creates one measurement set whose Main table is filled row after row with the observational data coming from the BDFs after some rearrangement. A modification of the filler leading to parallelizing the filling of the Main table seems to be difficult, not to say impossible (at least with the storage managers that we know), and the choice of writing many measurement sets in parallel, whose union would be equivalent to the unique measurement set resulting from the execution of the sequential filler, appears as the simplest (if not the only) solution.

How to partition the data for the parallelization ?

Now that we have decided to produce a set of measurement sets for one ASDM dataset, the next step is to decide what will go into each measurement set. In other words, how do we logically partition the ASDM dataset so that the execution of the filler on each element of the partition, presumably done in parallel, will produce one distinct measurement set ? The answer comes with the notion of configuration, which is part of the ASDM's design. The notion of configuration is based on the idea that the set of data produced during one observation is by nature partitionable. One configuration description collects all the information describing one element of the partition, i.e. a subset of the observation data; this information includes, among other things :
  • Which processor has produced the data ?
  • Which kind of spectral resolution is attached to these data, full resolution, channel average, baseband wide ?
  • What are the frequencies and the polarizations attached to these data ?
All the configurations are described and recorded in configuration descriptions which are stored in the ConfigDescription table. Now what is important to understand is that the data contained in one BDF are described by one and only one configuration description; consequently it is very easy to partition the set of BDFs into a set of subsets where all the elements of one subset share the same configuration description. So, using the configuration descriptions, we have reached a first level of partitioning which can be used for the parallelization. For example, in the case of uid://A002/X48b450/Xc2 we have 7 configuration descriptions :
  • 0 : (CORRELATOR, FULL_RESOLUTION, 22 antennas, 4 pairs numChan x numCorr ([('128', '2'), ('128', '2'), ('128', '2'), ('128', '2')]))
  • 1 : (CORRELATOR, CHANNEL_AVERAGE, 22 antennas, 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')]))
  • 2 : (RADIOMETER, FULL_RESOLUTION, 22 antennas, 1 pair numChan x numCorr ([('4', '1')]))
  • 3 : (CORRELATOR, FULL_RESOLUTION, 22 antennas, 4 pairs numChan x numCorr ([('128', '2'), ('128', '2'), ('128', '2'), ('128', '2')]))
  • 4 : (CORRELATOR, CHANNEL_AVERAGE, 22 antennas, 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')]))
  • 5 : (CORRELATOR, FULL_RESOLUTION, 22 antennas, 4 pairs numChan x numCorr ([('3840', '2'), ('3840', '2'), ('3840', '2'), ('3840', '2')]))
  • 6 : (CORRELATOR, CHANNEL_AVERAGE, 22 antennas, 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')]))
A first idea is to parallelize by configuration description, i.e. in the case of the example above to create seven measurement sets out of uid://A002/X48b450/Xc2. Nonetheless the number of bytes to be processed per configuration description is inherently very variable, and parallelizing on the basis of one thread per configuration description may lead to a very unbalanced workload between threads. The case of uid://A002/X48b450/Xc2 is a good illustration of this disparity in the number of bytes between configuration descriptions :
  • 0 : 25 962 410 bytes
  • 1 : 504 480 bytes
  • 2 : 766 839 bytes
  • 3 : 74 955 735 bytes
  • 4 : 1 752 581 bytes
  • 5 : 8 083 622 224 bytes (!!!)
  • 6 : 15 508 234 bytes
Obviously most of the time would be consumed in the thread in charge of configuration description #5, and the parallelization would not give a visible improvement in performance. Thus we must look for another "axis" of parallelization. Even if we continue to consider the partition based on the configuration description, we take into account the fact that inside one configuration description the data can be partitioned again per data description, i.e. per pair (spectral window, polarization). More importantly, even if it is not guaranteed in all cases, the amount of bytes to process per data description is more evenly distributed. Our sample dataset is an example of a situation where the balance is perfect :
  • 0 : 4 pairs numChan x numCorr ([('128', '2'), ('128', '2'), ('128', '2'), ('128', '2')])
  • 1 : 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')])
  • 2 : 1 pair numChan x numCorr ([('4', '1')])
  • 3 : 4 pairs numChan x numCorr ([('128', '2'), ('128', '2'), ('128', '2'), ('128', '2')])
  • 4 : 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')])
  • 5 : 4 pairs numChan x numCorr ([('3840', '2'), ('3840', '2'), ('3840', '2'), ('3840', '2')])
  • 6 : 4 pairs numChan x numCorr ([('1', '2'), ('1', '2'), ('1', '2'), ('1', '2')])
Hence the idea of parallelizing the production of the MSs containing the data coming from one and the same BDF but associated with different data descriptions.

Process the configuration descriptions sequentially, read the BDFs sequentially for each configuration description, and write the data in parallel, per data description.

The algorithm

The algorithm is as follows :

for each configuration description cfgId defined for the observation
    consider the sequence s_cfgId of BDFs having cfgId as their configuration description, ordered by ascending time
    for each BDF bdf of s_cfgId
        read bdf
        for each data description dd of cfgId        <------ the steps of this loop are performed in parallel threads
            identify the data contained in bdf and associated with dd, process them and write them into their own measurement set
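For illustration, this loop structure can be sketched in a few lines of Python (a sketch only: the real filler is written in C++ and parallelized with OpenMP directives; read_bdf and write_ms_rows are hypothetical stand-ins for the actual reading and writing code). Only the innermost loop, over the data descriptions of the current configuration, runs in parallel threads :

```python
from concurrent.futures import ThreadPoolExecutor

def fill_parallel(config_descriptions, bdfs_by_config, read_bdf, write_ms_rows):
    """Configuration descriptions and BDFs are traversed sequentially;
    only the per-data-description processing of one BDF is fanned out
    to worker threads, each one writing into its own measurement set."""
    for cfg in config_descriptions:
        # the BDFs sharing this configuration description, in ascending time order
        for bdf_name in sorted(bdfs_by_config[cfg["id"]]):
            bdf = read_bdf(bdf_name)          # sequential read of the whole BDF
            with ThreadPoolExecutor() as pool:
                # one task per data description of cfgId
                list(pool.map(lambda dd: write_ms_rows(bdf, dd),
                              cfg["data_descriptions"]))
```

Each task writes into its own measurement set, which is what makes the concurrent writes independent of each other.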
The test.

We have tested a preliminary implementation of the algorithm described above. Instead of trying to write real CASA measurement sets, we preferred to observe the effect of the parallelization when the processed data are output to plain binary files. The parallelization is done on shared-memory multi-processor/multi-core machines with OpenMP directives.

The tests are performed on three different platforms :
  • An iMac under MacOS X 10.6.8, one Core i7 processor (4 cores) with 16 Gb of RAM and one 1 Tb SATA disk
  • A node of momentum.obspm.fr (24 nodes, 4 processors per node, 16 cores per processor, FHGFS file system)
  • A node of the cluster at the ARC located at ESO Garching ( Lustre file system)
The results.

We ran two series of tests, one which performs the full suite of operations (read, process, write) and the other which performs only (read, process), with the option --dry-run. We wanted to see how the concurrent writes performed in parallel by different threads would impact the relative performance of parallel vs. sequential.
Full suite, read-process-write.

The command :
time parasdm2MS -v --parallel --no-pointing --no-caldev ./uid___A002_X48b450_Xc2
  • arcp2.eso.org (ESO)
      with --parallel : real 15m31s, user 50m56s, sys 3m36s
      without --parallel : real 49m42s, user 45m6s, sys 3m
  • momentum.obspm.fr (LERMA)
      with --parallel : real 18m23s, user 1h07s, sys 1m42s, %CPU 381%
      without --parallel : real 1h07m, user 1h03m, sys ???, %CPU 96%
  • volte.obspm.fr (Michel's iMac)
      with --parallel : real 30m57s, user 67m41s, sys 4m24s
      without --parallel : real 1h11m, user 1h01m, sys 1m49s

Comments :

  • The ratios of sequential elapsed time over parallel elapsed time are 3.2, 3.64 and 2.29 respectively for the ARC cluster, the LERMA cluster and the iMac. To be meaningful these values must be presented along with the number of threads created for most of the configurations, which is equal to 4. One sees that for the cluster machines the ratios, clearly above 3, prove the interest of the parallelization even if the theoretical optimum (4 - epsilon) is not reached; at least one may think that the filesystems cope well with the parallel writing.
  • The performances of the two clusters are comparable, with a slight advantage to the ESO machine though. In any case their different filesystems seem to be good competitors.
  • The performances of the iMac compared with those obtained on the (expensive) clusters are not so bad (2 times slower than the ESO cluster, 1.7 times slower than the LERMA cluster).
  • On the other hand the performance gain due to the parallelization on the iMac, 2.29, is rather poor in a context of four running threads. One is tempted to explain this by the concurrent writes, which the iMac cannot perform as easily as the clusters with their high-performance file systems.
Incomplete suite, read-process-nowrite.

The command :

time parasdm2MS -v --dry-run --no-pointing --no-caldev ./uid___A002_X48b450_Xc2
  • arcp2.eso.org (ESO)
      with --parallel : T.B.D.
      without --parallel : T.B.D.
  • momentum.obspm.fr (LERMA)
      with --parallel : T.B.D.
      without --parallel : T.B.D.
  • volte.obspm.fr (Michel's iMac)
      with --parallel : T.B.D.
      without --parallel : T.B.D.

Filling orders of the MS Main table - Lazy filler vs. plain filler

This memo describes how the rows written in the MS Main table by the filler are ordered depending on the flavour, lazy or plain, utilized to generate the measurement set.

Both filling policies, lazy or plain, process sequentially a collection of so called subscans. A subscan contains a subset of the data recorded during the full observation (or Execution Block) by a given processor. It's characterized mostly by the processor which produced the data, the configuration in use (e.g. antennas, spectral and polarization specifications) and of course a time range (its start time and its duration).

What we describe now is the order in which the MS Main table rows are populated while one given subscan is traversed by the filler. The order between two rows of the MS Main table coming from data belonging to two different subscans will be described afterwards.

Filling order for data belonging to the same subscan.

A subscan is itself an ordered sequence of so called "integrations" (the term subintegration is used when channel averaging is done, but this changes nothing to the reasoning), ordered by increasing time. Each integration contains all the data recorded by the processor and the configuration which characterize the subscan. These data are split into separate parts depending on their nature (cross data, auto data, flags, zerolags).
Structure of an integration.

In the current implementation of the software, the entities which define the structure of the content of one integration are :
  • the antennas, considered either individually or as the extremities of one baseline.
  • the data descriptions, i.e. the pairs of (spectral window, polarization) configurations.
  • the frequencies.
  • the polarizations.
Whatever the processor (correlator, radiometer), the data are ordered as follows :
  • ANTENNA - DATA DESCRIPTION - FREQUENCY - POLARIZATION for the autocorrelation data (if any).
  • BASELINE - DATA DESCRIPTION - FREQUENCY - POLARIZATION for the cross correlation data (if any).
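This enumeration can be sketched as nested loops in Python (an illustration only; the function name and the (numChan, numCorr) representation of a data description are assumptions of mine, and the frequency axis is reduced to the channel index) :

```python
def auto_data_order(antennas, data_descriptions):
    """Enumerate the auto-correlation slots of one integration with
    ANTENNA varying slowest, then DATA DESCRIPTION, then FREQUENCY
    (channel), then POLARIZATION, as described above."""
    slots = []
    for ant in antennas:
        for dd_index, (num_chan, num_corr) in enumerate(data_descriptions):
            for chan in range(num_chan):
                for corr in range(num_corr):
                    slots.append((ant, dd_index, chan, corr))
    return slots
```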
Antennas ordering.

The antennas used for the acquisition of the data are listed in the entry of the ASDM ConfigDescription table which describes the configuration in use during the subscan execution. This list must be considered as an ordered list.
Baselines ordering.

Given the ordered list of antennas found in the ConfigDescription table, say (A1, A2, ..., An), the sequence of baselines is ordered as follows :

(A1, A2), (A1, A3), (A2, A3), (A1, A4), (A2, A4), (A3, A4) ...

in other words the index of the first antenna varies faster than the one of the second antenna. Let's call this ordering BLORDER1.

One may notice that BLORDER1 is the transpose of BLORDER2 (where the index of the second antenna varies faster) :

(A1, A2), (A1, A3), (A1, A4) .... (A1, An), (A2, A3), (A2, A4),...,(A2, An),...

which appears more often.
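The two orderings can be generated with a few lines of Python (a sketch; in practice the antenna list comes from the ConfigDescription table) :

```python
def blorder1(antennas):
    """(A1,A2), (A1,A3), (A2,A3), (A1,A4), ... :
    the index of the first antenna varies faster."""
    return [(antennas[i], antennas[j])
            for j in range(1, len(antennas))
            for i in range(j)]

def blorder2(antennas):
    """(A1,A2), (A1,A3), ..., (A1,An), (A2,A3), ... :
    the index of the second antenna varies faster."""
    return [(antennas[i], antennas[j])
            for i in range(len(antennas) - 1)
            for j in range(i + 1, len(antennas))]
```

Both functions enumerate the same n(n-1)/2 baselines, only in different orders; this is the transposition relation mentioned above.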
Data ordering in one subscan (BDF).

Grouping the orderings described above, one obtains :
  • TIME - ANTENNA - DATA DESCRIPTION - FREQUENCY - POLARIZATION for the autocorrelation data (if any).
  • TIME - BASELINE - DATA DESCRIPTION - FREQUENCY - POLARIZATION for the cross correlation data (if any).
What does the filler do exactly ?
The filler has a lot of transpositions to do.
Lazy or plain, the filler has a lot of transpositions to perform when it fills the MS Main table since, for performance reasons, the time must vary faster than the data description. In other words a preferred ordering in the MS is something like :

DATA_DESCRIPTION - TIME - ANTENNA followed by BASELINE - FREQUENCY - POLARIZATION

Other combinations are possible, but the imperative condition was that DATA_DESCRIPTION varies slowest.
The plain filler ordering.

The MS Main table rows resulting from the data contained in the same BDF (subscan) are ordered as follows :

DATA_DESCRIPTION - (TIME - ANTENNA - FREQUENCY - POLARIZATION) followed by (TIME - BASELINE - FREQUENCY - POLARIZATION) with baselines ordered with BLORDER2.
The lazy filler ordering.

The MS Main table rows resulting from the data contained in the same BDF (subscan) are ordered as follows :

DATA_DESCRIPTION - TIME - ( ANTENNA - FREQUENCY - POLARIZATION) followed by ( BASELINE - FREQUENCY - POLARIZATION ) with baselines ordered with BLORDER1.

So obviously the two flavours of the filler do not order the rows produced by the same subscan in the same way. The two differences are :
  • The baselines are ordered differently.
  • The AUTO and CROSS data are interleaved differently.
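These two differences can be made concrete with a small sketch which builds, for both flavours, the sequence of (data description, time, AUTO/CROSS, antenna or baseline) keys of the MS Main rows produced by one subscan (the frequency and polarization axes are omitted for brevity; this is an illustration, not the filler's actual code) :

```python
def plain_rows(dds, times, antennas):
    """Plain filler: within one data description, all AUTO rows in time
    order, then all CROSS rows in time order (baselines in BLORDER2)."""
    rows = []
    n = len(antennas)
    for dd in dds:
        for t in times:
            for a in antennas:
                rows.append((dd, t, "AUTO", a))
        for t in times:
            for i in range(n - 1):
                for j in range(i + 1, n):      # BLORDER2
                    rows.append((dd, t, "CROSS", (antennas[i], antennas[j])))
    return rows

def lazy_rows(dds, times, antennas):
    """Lazy filler: within one data description and one timestamp, the
    AUTO rows are immediately followed by the CROSS rows (BLORDER1)."""
    rows = []
    n = len(antennas)
    for dd in dds:
        for t in times:
            for a in antennas:
                rows.append((dd, t, "AUTO", a))
            for j in range(1, n):              # BLORDER1
                for i in range(j):
                    rows.append((dd, t, "CROSS", (antennas[i], antennas[j])))
    return rows
```

The two sequences contain exactly the same rows, only in different orders, which is why a lazily filled MS and a plainly filled MS cannot be compared row by row.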

How to check the build server activity. (News from Jenkins)

https://casa-jenkins.nrao.edu/job/casa-active/

How do I get disk usage of a directory excluding the subdirectories.

I needed this in order to measure the space occupied on disk by an ASDM excluding the binary data (i.e. the ASDMBinary subdirectory). First cd to the directory of interest and then :

ls -lF | grep '[^/]$' | awk 'BEGIN {s=0} {s=s+$5; print $5 " " $9 } END {print s / 1024 / 1024 }'
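An equivalent Python sketch, for the record; like the awk one-liner it sums the apparent sizes of the regular files located directly under the directory, ignoring subdirectories, and unlike du -sk it does not count allocated disk blocks (the function name is mine) :

```python
import os

def flat_size_mib(path="."):
    """Sum of the apparent sizes of the regular files directly under
    `path` (subdirectories such as ASDMBinary are excluded), in MiB."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_file():
            size = entry.stat().st_size
            print(size, entry.name)   # mirror the per-file output of the awk version
            total += size
    return total / 1024 / 1024
```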

Have you lost your password on a Mac OS X Lion machine ?

http://www.macyourself.com/2011/08/20/how-to-reset-password-for-mac-os-x-10-7-lion/

How do I open a port ?

On Scientific Linux (and other avatars of Red Hat I think)

firewall-cmd --zone=public --add-port=2888/tcp --permanent
firewall-cmd --reload

The example above permanently opens the port 2888 in the public zone (1st command) and makes the change effective (2nd command).

Working with Oracle.

How to delete the content of all the ASDM tables in the Oracle SQL database ?

The query below generates one delete statement per ASDM table; its output must then be executed (in Oracle, the FROM keyword is optional in a DELETE statement) :

select 'delete '||table_name||';' from user_tables where table_name LIKE 'ASDM_%' ;

How to open the port 1521 (Oracle listener) ?

sudo iptables -I INPUT -p tcp --dport 1521 --syn -j ACCEPT
sudo service iptables save 

How to see all the historical changes of one file in SVN. -- MichelCaillat - 15 Nov 2016

Use the smart bash script below. Credits : http://stackoverflow.com/questions/282802/how-can-i-view-all-historical-changes-to-a-file-in-svn/283168#283168

# history_of_file
#
# Outputs the full history of a given file as a sequence of
# logentry/diff pairs.  The first revision of the file is emitted as
# full text since there's not previous version to compare it to.

function history_of_file() {
    url=$1 # current url of file
    svn log -q "$url" | grep -E -e "^r[[:digit:]]+" -o | cut -c2- | sort -n | {
#       first revision as full text
        echo
        read r
        svn log -r"$r" "$url"@HEAD
        svn cat -r"$r" "$url"@HEAD
        echo
#       remaining revisions as differences to previous revision
        while read r
        do
            echo
            svn log -r"$r" "$url"@HEAD
            svn diff -c"$r" "$url"@HEAD
            echo
        done
    }
}

history_of_file "$1"

Summary of changes.

CasaActive35

Available software.

This section tries to make an inventory of the SoftwareToolsRelatedWithTheASDM.

Available documents.



  • bdf.pdf: The ASDM Binary Data Reference document (BDF).
Topic revision: r62 - 15 Nov 2016, MichelCaillat
 
