This guide provides a basic example for calibrating e-MERLIN C-Band (5 GHz) continuum data in CASA. It was re-written in 2019 using CASA 5.6. It is designed to run using an averaged data set small_avg.ms, in which case you will need 6-10 G disk space to complete this tutorial (if you have <10 G you will need to delete some intermediate steps or move them to storage).
3C277.1 is a twin-lobed radio galaxy. These C-band (4-7.3 GHz) e-MERLIN observations were made specifically for this tutorial but previous MERLIN and VLA images have been published by Ludke et al. 1998 and Cotton et al. 2006 .
The CASA home pages describe downloading and installing CASA; also see Getting Started in CASA .
Download the files you need for this tutorial in advance (or, at ERIS, get it from a USB drive):
wget -c http://almanas.jb.man.ac.uk/amsr/3C277p1/2019/small_avg.ms.tar wget -c http://almanas.jb.man.ac.uk/amsr/3C277p1/2019/1331+3030_2019C_I_ap1.tar wget -c http://almanas.jb.man.ac.uk/amsr/3C277p1/2019/3C277.1_cal_script_flags.tarYou can arrange the work as you like as long as you have enough space (don't do data reduction in the CASA installation). I used the following directory structure:
> 3C277.1/ # parent directory with subdirectories: | calibration/ # for the work described here | analysis/ # for total intensity imaging and self-calibration | polarization/ # for polarization calibration and imagingso I placed these tar files in the calibration directory. Extract each using
tar -xvf small_avg.ms.taretc. This will give you
small_avg.ms # Input visibility data 3C277.1_cal_outline.py # Script to fill in small_avg_cals1_editme.flags # List of flag commands to complete small_avg_cals1.flags.gz # Example complete list 3C277.1_cal_all.py.gz # Full script 1331+3030_ap1.clean.image.tt0 # For info: 3C286 image 1331+3030_ap1.clean.model.tt0 # 3C286 model 1331+3030_ap1.clean.model.tt1 # 3C286 model for spectral dependenceDon't gunzip the complete flag and script files unless you are really stuck!
The main purpose of this User Guide is to help you produce a script which can be executed one step at a time, to inspect, flag and calibrate 3C277.1. Each stage is performed by executing CASA tasks. Copy 3C277.1_cal_outline.py to the name of your choice and open your script in a text editor, ideally one like emacs which understands python syntax, and read the introduction. Some of the parameters' values are used more than once and these are entered as variables just after the list of steps. Others, you will enter when you come to the task by replacing '**'. Note that in some cases values are entered as 'strings' in inverted commas, but in other cases '**' is just to allow the placeholder; variables, strings and logical True or False do not need inverted commas - i.e. if you get an error try removing them.
It is also possible to enter task parameters at the command line. Whatever method you use, you are encouraged enter e.g. inp gaincal or help gaincal in the CASA terminal to examine your options and see how to check for syntax errors. You can run the task directly by typing gaincal at the prompt; if you want to change parameters and run again, if the task has written a calibration table, delete it (make sure you get the right one) before re-running. You can review these by typing:
# In CASA: !more gaincal.last
taskname = "gaincal" vis = "all_avg.ms" caltable = "bpcals_precal.ap1" ... ... #gaincal(vis="all_avg.ms",caltable="bpcals_precal.ap1",field="1407+284",spw="",intent="", selectdata=True,timerange="",uvrange="",antenna="",scan="",observation="",msselect="",solint="120s", combine="",preavg=-1.0,refant="Mk2",minblperant=2,minsnr=2,solnorm=True,gaintype="G",smodel=[], calmode="a",append=True,splinetime=3600.0,npointaver=3,phasewrap=180.0,docallib=False,callib="", gaintable=['all_avg.K', 'bpcals_precal.p1'],gainfield=[''],interp=[],spwmap=[],parang=False)
When you are happy with the values which you have set for each parameter, enter them in the script 3C277.1_cal_outline.py (or whatever you have called it). You can leave out parameters for which you have not set values; the defaults will work for these (see e.g. default gaincal for what these are), but feel free to experiment if time allows.
You do not need to make disk copies of plots, but this is included in the full script. Some of these plots are linked below so you can compare what you get.
The imput data set all_avg.ms contains five fields:
e-MERLIN name | Common name | Role |
---|---|---|
1252+5634 | 3C277.1 | target |
1302+5748 | J1302+5748 | phase reference source |
1407+2827 | OQ208 | Spare bandpass cal. (usually about 2 Jy at 5-6 GHz |
0319+4130 | 3C84 | bandpass, polarization leakage cal. (usually 10-30 Jy at 5-6 GHz) |
1331+3030 | 3C286 | flux, polarization angle cal. (flux density known accurately, see step 10) |
These data were taken in full polarization in 0.512 GHz bandwidth centred on 5072 MHz, in 4 adjacent spectral windows (spw's, also known as IFs). Each 128-MHz spw contains 512 channels 0.25 MHz wide and 1 s integrations were used (the data are later averaged). The e-MERLIN User Support web page describes e-MERLIN data in more detail, including the Cookbook and the CASA pipeline The pipeline was used for initial steps, including applying flags for when the telescopes were not on-source, ensuring correct uvw coordinates, averaging in time (to 4 s integrations) and frequency (to 2-MHz channels), ready for calibration.
The name of the MS is set as variable thisvis in the script and here (to make it easier to adapt this script for other data). Note no final commas if entering parameter values at the command line.
#In CASA: thisvis='small_avg.ms' os.system('rm -rf '+thisvis+'.listobs.txt') # remove previous output, if any default listobs vis=thisvis # measurement set to list listfile=thisvis+'.listobs.txt' # output text file inp listobs # check inputs listobs # run task
You can enter the commands manually to run listobs. When it has finished, in CASA when working at the command line you can prefix any shell command with ! e.g.
#In CASA: !ls *listobs.txt # should show the file written !gedit small_avg.ms.listobs.txt # or your favourite text viewer
These commands are equivalent to part of script step 1 (with commas):
os.system('rm -rf '+thisvis+'.listobs.txt') listobs(vis=thisvis, overwrite=True, listfile=thisvis+'.listobs.txt')
The next parts of step 1 plot the antenna positions and the uv tracks for the phase-reference source in plot windows. You will need to edit the script: use help plotms to decide what values to enter for the x and y axes instead of **.
# Plot visibility tracks in plotms plotms(vis=thisvis, xaxis='**', yaxis='**', # plot u against v coloraxis='spw',avgchannel='16', field=phref)
You can run all of step 1 using the script:
#In CASA: mysteps=[1] execfile('3C277.1_cal_mine.py') # or whatever you have called the script
You can see from the plot of e-MERLIN antenna positions that Mk2 has a lot of short baselines, which are easier to calibrate, and this is usually chosen as the reference antenna, set as variable antref in the script.
The uv tracks of the phase-reference source show that although the broad band improves the uv coverage, the missing spacings will produce sidelobes which we will deconvolve in cleaning.
Look at script step 2 and enter the appropriate variables instead of asterisks. Run the step. The lower sensitivity of edge channels is due to the bandwidth selection process and can be assumed to be the same for all antennas, so you can just look at one baseline. Step 2 opens the plotms window showing spw 0 and you can scroll through the spw interactively. For each spw, note the range of edge channels which fall below the sensitivity of the rest of the channels and enter these values in step 3, flagdata.
When you have filled in the parameters, run step 3 and iterate through the spw to check. The spw should look something like this If you have flagged too much data, you can re-run flagmanager to restore the saved state - you can enter each parameter on a separate line or you can copy and paste the whole script-style command.
# Only if you have flagged incorrectly! #In CASA: flagmanager(vis=thisvis, mode='restore', versionname='pre_endchanflag')
This follows a similar principle but starting with plotting amplitude against time. These errors may be different for each antenna. Copy small_avg_cals1_editme.flags to small_avg_cals1.flags so you can edit it. Then enter the parameter values and run step 4, and record the flags you want to add in small_avg_cals1.flags. As you can see, step 5 starts with another flag backup. Once you have entered the values, run step 5 and compare the plotms output such as 0319+4130 Mk2 & Kn before flagging and after. Use flagmanager to restore if something goes wrong.
Step 6 will plot the raw phase against frequency for a bright source. This is usually a bandpass calibrator. Set a time averaging interval in the script and run step 6. You can page interactively through the baselines to the refant. The lines for each of RR and LL represent separate scans on 0319+4130. The exact appearance will depend on the averaging interval chosen.
Uncorrected phase v. frequency per scan on Mk2 & Kn
The sign convention for e-MERLIN is such that the delay is negative if the phase is increasing with respect to frequency and positive if it is decreasing.
The y-axis spans -200 to +200 deg and some data has a slope of about a full 360 deg across ~100-MHz bandwidth.
QUESTION 1 What is the apparent delay error this corresponds to? What will be the effect on the amplitudes? Answer 1
QUESTION 2Do you expect the delay corrections to be of the same sign for all times? Paging through the baselines, which antenna has the delay error of largest magntiude? Answer 2
Next, enter the missing values in steps 7 and 8 and run them separately. Examine the
plot each time:
Step 7, a single channel
Step 8, all channels averaged
QUESTION 3 In the plots, which has the higher amplitudes - the channel-averaged data or the single channel? Why? Answer 3
The CASA task gaincal is used to derive most calibration. Type inp gaincal or help gaincal to find out more. This assumes that all the calibrators are point sources and compares their phases as a function of frequency with the flat zero phase expected. If the delay looks stable within each scan, you can set the solution interval to a scan length. Fill in the ** for script step 9 and run it. You may get a lot of messages about 'no unflagged data' which is generally just where indeed data are flagged, and nothing to worry about unless the solutions plot shows unexpected gaps.
This also plots the resulting Delay correction table for all calibration sources. The solutions are zero for the refant as the other phases are calculated relative to this.
1331+3030 (= 3C286) is an almost non-variable radio galaxy and its total flux density is very well known as a function of time and frequency (Perley & Butler 2013) . It is somewhat resolved by e-MERLIN, so we use a model made from a ~12 hr observation cycling between 0.5 GHz bands in the range 4.5 to 7.5 GHz. The model is two sets of clean components (see Imaging tutorial) and ft enters these in the Measurement Set so that their Fourier transform can be used as a uv-plane model to compare later with the actual visibility amplitudes. These models take into account the spectral slope of 1331+3030; note that the e-MERLIN pipeline uses a single-frequency model and a slightly different method.
0319+4130 (= 3C84) is a very bright, compact QSO which is highly variable but (at this resolution, at cm wavelengths) unpolarized. Its flux density and spectral index can be determined from 1331+3030, but here we use values previously derived for the same date, 31.4 Jy (uncertainty 0.2 Jy) at 5.06998 GHz with a spectral index of 0.45 (uncertainty 0.15).
Enter the required parameters and run step 10. If you make a mistake (e.g. set a model for the wrong source) use task delmod> to remove it. The Fourier plane models of 1331+3030 and 0319+4130, amp v. uv distance show both the brighter and weaker calbrators. What do the plots tell you about the structures of the two sources?
We have a Catch 22 situation: in order to derive good time-dependent corrections we need to average amplitude as well as phase in frequency, but in order to average in frequency we need the phase to be flat in time and the amplitude to have the correct spectral index. So we derive initial time-dependent solutions for the bandpass calibration sources, but discard these after they have been applied to derive the initial bandpass solutions. These, in turn, are replaced in script step 11.
Estimate the time-averaging interval for phase by plotting phase against time for a single channel near the centre of the band for a bandpass calibration source. If the results for the first channel you try look bad, it may be a partly flagged channel; try a different one. Enter the values and run step 11.
Zoom to help decide what phase solution interval to use and enter this as variable solintp1 (just under the list of steps at the start of the script). The integration-to-integration scatter is noise (mostly instrumental) which cannot be removed but the longer-term wiggles are mainly due to the atmosphere and we want to derive corrections to reduce these to about the level of the noise or at least to less than 5-10 deg. The longer a solution interval, the better the S/N but the interval has to be short enough not to average over the atmospheric wiggles. About 5 degrees accuracy is OK, you cannot remove thermal noise. Zoom in on plot of 0319+415 uncorrected phase v. time
QUESTION 4 The plot shows 4 spw. What other effect, apart from atmospheric phase errors, will be solved by calibrating phase against a point model at the phase centre? Answer 4
Next, enter the variables in step 12. This compares the phases of 0319+4130, as a function of time, against a point model. Applying the previous delay corrections means that averaging over all channels is fairly accurate. The solutions will be plotted in the CASA plotter. Check the solutions. Each spw should have a coherent set of solutions, not just noise, you may have to zoom in to see this. The jumps on Mk2 are where it was flagged and a different refant was used. By default this will be the next antenna in the array with data (i.e. Kn), but you can force it by giving a list in order of preference, e.g. refant='Mk2,Pi', and see also step 16. Initial phase solutions for the bandpass calibrator Note that although the pattern of solutions is hard to see in the png, you can zoom in plotcal to see that it follows the atmospheric phase rate smoothly. Also, note that the refant phase jumps, showing that Mk2 had some bad data and an alternative antenna was used. This does not matter here as these solutions are only used in averaging over all times prior to bandpass calibration, but see comments on refant in Step 16.Now derive amplitude solutions in step 13. Fill in the ** and also set the solution interval variable solinta. The plot of amplitude against time in Steps 7 and 8 showed that the uncorrected amplitudes have only a slow systematic slope with time. Applying the previous delay and phase solutions allows a longer solution interval to be used, minimising phase noise.
QUESTION 5 Why solve for phase separately before solving for amplitude? Answer 5
If we not yet derived flux densities for the bandpass calibration sources we would normalise the gains (i.e. individual solutions differ but the product is unity so that the overall flux scale is not affected). Since, here, we have set the flux of 0319+4130, we set solnorm=False and obtain solutions which, when divided into the data, will correct it to the value of the model. Check the solutions which are plotted, which should track smoothly for each spw - any wild points mean either bad data or bad phase or delay solutions. Initial amplitude solutions for bandpass calibrators
Fill in the ** and run step 14 to use bandpass to derive solutions for phase and amplitude as a function of frequency (i.e., per channel). In order to allow averaging in time, apply the time-dependent corrections which you have just derived.You may see messages about failed solutions but these should all be for the end channels which were flagged anyway.
Inspect the solutions produced by plotcal. The phase corrections are small as the delay solutions were already applied. The amplitude solutions will remove remaining wiggles from the bandpass. The
Phase bandpass solutions B1, coloured by spw
Step 15 also applies the delay and bandpass solutions to the calibration sources as a test, to check that they have the desired effect of producing a flat bandpass for the phase-reference. There is no need to apply the time-dependent calibration as we have not derived this for the phase-reference source anyway. Some notes about applycal:
help applycal
).The delay correction ('K') table contains entries for all sources except the target. The phase reference scans bracket the target scans so it is OK to use the default 'linear' interpolation.
The bandpass ('B1') table is derived from a few bandpass calibration observation and there may be data for other sources outside these times, so extrapolation in time is needed, using 'nearest'. 'linear' interpolation is OK for the bandpass calibrator in frequency. Run step 15.
Plotms is used to plot the phase ref phase before bandpass correction. In the plotms window, axes tab, change the ydatacolumn to 'corrected', tick 'reload' and plot again. Then change the yaxis to 'amp' and repeat for 'data' and 'corrected'.
Examples of the phase ref phase and amplitude before and after applying bandpass and delay corrections, coloured by scan:
phase, before and
phase, after
amp, before and
amp, after
In step 16, gaincal is used to derive time-dependent phase solutions for all the calibration sources, applying the delay and bandpass tables. First, in case you are repeating steps, models for the calibrators other than the flux scale are reset to the default. Replace the **, check that variable solintp1 is set to a suitable interval and run step 16.
When you run gaincal, check the messages in the logger and the terminal.
The logger shows something like:
2019-09-16 17:57:15 INFO gaincal Calibration solve statistics per spw: (expected/attempted/succeeded): 2019-09-16 17:57:15 INFO gaincal Spw 0: 1136/974/974 2019-09-16 17:57:15 INFO gaincal Spw 1: 1136/974/974 2019-09-16 17:57:15 INFO gaincal Spw 2: 1136/974/974 2019-09-16 17:57:15 INFO gaincal Spw 3: 1136/974/974
Check the phase solution plots. The phase will change quite rapidly with time but if you zoom it there should be a consistent trend, not just random noise. Now, all Mk2 solutions are zero because in gaincal refantmode='strict' forces the phase to be re-referenced as if Mk2 was zero even if it has a bad solint. This is necessary for accurate polarization calibration.
In step 17, gaincal is used to derive time-dependent amplitude solutions for calibration sources, applying the delay and bandpass tables and the time-dependent phase solutions you just derived. Replace the ** and run step 17. This time, we set solnorm=False as, next, the solutions for the phase reference will be compared with those for 1331+3030 to deduce the scaling factor.
Check the amplitude solution plots. At this point, 1331+3030 has a realistic model set in the MS but the phase ref model is the default 1 Jy point source at the phase centre. Thus, the different fields have different gain solutions, but for each source they should be fairly flat.
The solutions for 1331+3030 in calsources.ap1 contain the correct scaling factor to convert the raw units to Jy as well as removing time-dependent errors. This is used to calculate the scaling factor for the phase reference using fluxscale. The gainthreshold of 0.7 excludes solutions more than 70% from the mean. This method assumes that all sources without explicitly-entered models are points, so it cannot be used for an extended target. We run this task in a different way because the calculated fluxes are returned in the form of a python dictionary. The script uses the variable calfluxes for this. They are also written to calfluxes.txt. After running step 18, inspecting calfluxes.txt should show something like (shortened):
Flux density for 1302+5748 in SpW=0 (freq=4.88e+09 Hz) is: 0.470313 +/- 0.0171981 (SNR = 27.3468, N = 11) Flux density for 1302+5748 in SpW=1 (freq=5.008e+09 Hz) is: 0.466131 +/- 0.0147343 (SNR = 31.6357, N = 11) .... # Fitted spectrum for 1302+5748 with fitorder=1: Flux density = 0.468 +/- 0.001 (freq=5.06998 GHz) spidx: =0.13 +/- 0.07
Typing calfluxes will show the same information as a python dictionary and this is used as an input to etjy to set the flux scale for the phase reference. This requires fluxscale and setjy to be run without exiting CASA in between. The model for the phase ref is plotted in plotms, each line is 16 channels averaged, coloured by spw.
If we had not known in advance the flux density and spectral index of the phase refernece, for the highest accuracy you might repeat step 14 with the more accurate model, and apply this new bandpass table in later steps. We don't need to do this.
For each calibration source, there is now a flux density and spectral index set for the MS Model column, as well as the delay, phase and bandpass calibration. These are used to derive replacement time-dependent amplitude solutions which will not only correct for short-term errors but also contain an accurate flux scale factor for all sources. Compare the previous time that you ran gaincal, check that solinta1 is a suitable value in the first, amplitude, gaincal in step 19 and identify the other parameter which you need to change now (apart from the output table name).
In order to get the best amplitude solutions for the calibrators, we used the shortest phase solutions giving good SNR. However, in order to interpolate the phase reference solutions to the target, a solution interval equivalent to the whole or half of each phase reference scan will provide the best approximation. Enter a suitable solint in the second, phase gaincal in step 19, and the other parameters required.
Run step 19 and check the plotcal output final amplitude calibration solution plot. Now that all the calibrators have models, the solutions should be approximately the same for all sources.
The phase solutions to be applied to the target should be similar to the table generated in step 16, but smoother.
Step 20 starts by backing up the flagging state, as it is possible to allow applycal to flag data. We will do this here, as very few solutions failed, so it is simplest to let them be flagged rather than investigating further.
The first applycal applied the bandpass correction to all sources. For the other gaintables, for each calibrator (field
), the solutions for the same source (gainfield
) are applied. Gainfield '' for the bandpass table means apply solutions from all fields in that gaintable. applymode='calflag'
will flag data with failed solutions.
The second applycal applies the phase-ref solutions to the target. We assume that the phase reference scans bracket the target scans so linear interpolation is used except for bandpass. Enter values for '**' and run step 20, which finishes by plotting corrected phase ref visibility amplitudes against uv distance.If there is just a little bad data, no need to worry if the target is bright enough to self-calibrate (which ours is).
QUESTION 6 What do the plots of amplitude against uv distance tell you? Answer 6
Step 21 splits out the target and makes a listing. Replace '**' and run step 21. Compare the list file with the one for the whole data set in step 1 to check.