EPICS Core Developers October 2017

EPICS Core Developers Workshop

Dates

        Oct 3rd - Oct 5th, 2017

Location

        ITER Headquarters, Bldg. & room TBD

Daily Timetable, Tuesday-Thursday

09:00 - 10:30        First session

10:30 - 11:00        Morning Break

11:00 - 12:30        Second session

12:30 - 13:30        Lunch Break

13:30 - 15:00        Third session

15:00 - 15:30        Afternoon Break

15:30 - 17:00        Fourth session

Parallel Meetings

Control System Studio (Weds + Thurs)

AreaDetector (Thursday)

Agenda

Items to be discussed (please add to this!)

Minutes

Tuesday, Oct 3rd

Present: RL, AJ, KS, (MD), TK, Peter Heesterman, Claudio Rosati, Miha Vitorovic

CAJ Topics

KS: CAJ history needs to be migrated into JCA, then add DiiRT. Need to figure out how to release into the central repo; knowing where to hit the button etc.

RL: Can help, try doing a pseudo-release of something to find out what permissions might be missing.

MD: Want release with my name-lookup-over-tcp code in it, but there is an issue with that code (blocking socket timeouts)

KS: Matej implemented a very different design pattern than I’m familiar with (leader-follower). Who will learn that design pattern? I’m still dealing with concurrency issues in events, and I’m scared of touching the code.

MD: This is the majority use case at the moment.

RL: And the archiver.

KS: Yes, I have an archiver problem which is also a threading issue.

MD: Sounds like this code doesn’t fail well.

MD: Leader-follower is a way of handling TCP sockets in a low-latency way (vs reactor pattern). L-f rotates a set of threads through select/poll(): the thread that sees data starts servicing it, and kicks off another thread that does the select/poll() to wait for the next data.
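
Illustrative sketch of the leader-follower rotation MD describes, written in Python for brevity. This is not the CAJ/JCA implementation: the function names, the use of one message per socket, and the lock used as the "leader token" are all assumptions made for the example.

```python
import select
import socket
import threading


def serve_leader_follower(socks, n_threads, n_events):
    """Read n_events messages from socks using a rotating leader thread."""
    leader = threading.Lock()      # the holder of this lock is the leader
    results_lock = threading.Lock()
    results = []                   # (thread name, payload) per handled event
    remaining = [n_events]         # only touched while holding 'leader'

    def worker():
        while True:
            with leader:                          # wait to become leader
                if remaining[0] == 0:
                    return                        # nothing left to wait for
                ready, _, _ = select.select(socks, [], [])
                payload = ready[0].recv(1024)     # claim one event
                remaining[0] -= 1
            # Leadership is released here, so a follower can already be
            # blocked in select() while this thread services its event.
            with results_lock:
                results.append((threading.current_thread().name, payload))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo():
    """Send one message on each of three socket pairs and collect them."""
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (tx, _) in enumerate(pairs):
        tx.sendall(b"event-%d" % i)
    results = serve_leader_follower([rx for _, rx in pairs], 2, 3)
    for tx, rx in pairs:
        tx.close()
        rx.close()
    return sorted(payload for _, payload in results)
```

The point of the pattern is that the thread which detects the data also processes it, so no hand-off to a separate worker queue is needed; the cost is that the leader token has to be passed on correctly in every code path.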

CR: Seems similar to the Java thread pool, exactly the same model.

KS: Yes, JCA implements exactly this. Any change requires significant amount of work. Making small changes is scary, we should invest effort to truly understand the code-base before making changes.

RL: And have tests

KS: Matej moved on to a different CA library, newer and more readable, uses Java concurrency model. Could that be merged in? He didn’t think so.

MD: No call to merge, that was a replacement. First step might be to evaluate that.

KS: My stack doesn’t speak to JCA directly, so would be easy to replace.

TK: pvAccess, caj/jca, plus the third one. Not many users of the new one; JCA must continue to be supported because of existing users. ESS OpenXAL has been rewritten to use pvAccess.

KS: Will we do any big developments on top of JCA in the future, or should we switch to the new thing?

AJ: Need to evaluate the new thing (and find out its name).

KS: JCAE (https://github.com/paulscherrerinstitute/jcae), being actively maintained by PSI and CosyLab. I saw a presentation, Modern Java-8 API. No idea about performance or stress testing. This is what Matej now recommends for future use.

AJ: How to evaluate it, and who should do that? Would the Java-8 improve performance?

KS: Not guaranteed, just makes it read better/easier for J8 dev’s. Compiler might be able to optimize better, but don’t know. Would need concrete tests to find out.

MD: Looking at the new CA API, how well does it deal with cancellation of asynchronous operations? E.g. requesting a Put on a currently disconnected PV.

TK: If we were to switch, who would maintain the old code?

KS: We could deprecate without having to throw out. We’re talking about a new connection layer, which should we develop on top of?

AJ: CAJ or JCA (changes names in minutes above to JCA)

KS: We will have both namespaces in the code, but the package name is JCA.

TK: We should take a good look at JCAE and if that works better then concentrate on that for future development.

KS: Timescales on that? We have future work that will need it

RL: Who should evaluate? Murali, KS, Gabriele

KS: Plus Kay. The goal of the new connection library is to see if it can be put into CSS, starting as a standard project under Phoebus. Trying to move SWT and Eclipse out of CS-Studio, so that you see the Sun again, Phoebus = CSS - (Eclipse + SWT); more rigorous testing, better build system.

MV: Concerned about stability, it will take time to stabilize the code. (by Will Rogers: I agree!)

KS: Discussion for later (dinner-time?). Current status is: very active prototype.

AI on KS/RL: Merge histories for CAJ/JCA. Others will look into JCAE.

Presentations for Saturday

TK: 3 blocks of time. EPICS 7 (2 hrs), lunch, lower-level, coffee, higher-level, break, Council topics.

KS: Request title change, pvData not pvAccess, and it’s only plans at this point.

AJ: My and MD’s talk content may meld slightly, I will discuss with him and we’ll agree how to cover the technical topics.

KS: (shows slides about EPICS 7 CSS topics)

RL: Yes, should be included.

KS: Will send a new title, make it a longer talk.

AJ: What about templates?

TK: (shows powerpoint template slides, choices for content slides) Here you can download the EPICS power point template: https://drive.google.com/drive/folders/0B_S5lwgLONm7bnphMlRlQjdzWHM?usp=sharing

For logos etc:

https://drive.google.com/drive/folders/0B_S5lwgLONm7c3hMbFRpNTBZbjg?usp=sharing

AI on AJ + MD: Discuss splitting EPICS 7 Core topics for presentation on Saturday, AJ to adjust Indico as necessary.

AJ: We will revisit the topic of talks again with HJ, BD and Mark Rivers.

pvAccess: status of name resolution using UDP Multicast?

RL: Issue between Bertrand (ITER) and Kevin (Cosylab) about using pvAccess for name resolution across subnet boundaries. Multicast name resolution isn’t implemented, according to Matej.

MV: Not sure if this was a failure to communicate.

AJ: Also question about Multicast in CA, I tried to test at APS but don’t know if the failure was related to our network configuration (nothing done specifically) or whether the code fully works.

RL: pva Server registers with Multicast addresses; why, if it doesn’t work (or does it)?

KS: It’s under the Channel Search Manager in pvAccessJava

TK: The specification talks about multicast, text in red implies that something is implemented (“simple and robust algorithm”).

RL: Multicast allows different groups of clients/servers to have different name resolution domains on the same network.

AJ: (arguing why in practice multiple naming domains probably not very useful). Someone who has multi-network access needs to check out what works, what doesn’t

AI on RL: I will check next time we have this setup.

AJ: And test MD’s implementation for CA too.

Developer Meetings for 2018

RL: Need to know for planning purposes.

AJ: Roughly 3 per year, how many do we want? Center of gravity still US for Core

KS: CSS is more split, several Europeans. Will want some joint meetings to discuss Phoebus.

TK: Offer to host one meeting at ESS, Spring or summer. June would be good for weather.

KS: Will ask Richard if we can host one at BNL.

RL: BNL in the Fall?

AJ: APS hosting the full EPICS Collaboration meeting June 13-15th, so Core dev’s 11 & 12.

RL: Then ESS October maybe.

AJ: The Fall meeting is in Asia, no idea where or when. Defer BNL until 2019.

KS: ICALEPCS 2019 is in New York.

Break for lunch (long)

Present: +MD, MK, KK

Travis/Jenkins Builds

RL: Move towards using Travis-CI instead of Jenkins. Harder to orchestrate things beyond build+test.

MD: I want a CI builder where I have the ability to control the build process without asking someone else to fix things.

(discussion)

KS: Happy with Travis-CI, but…there are limits, upstream/downstream CI support is not available. You are limited to what docker containers are supported by travis. Release and publishing of documentation jobs are also difficult to support.

RL: Can’t have Travis publish built doc’s to SourceForge without publishing an ssh private key to allow that. We don’t currently have the ability to trigger other jobs.

AJ: We will need to continue using Jenkins to build for VxWorks. Could transfer doc jobs to APS, but RL would lose management ability.

(discussion)

RL: We have offers of servers to run stuff, the issue is administering the software (updates etc.).

KS: Travis has ability to decrypt environment variables for builds, would allow SSH keys. Example (https://github.com/ControlSystemStudio/cs-studio/blob/master/build/travis.sh)

(discussion)

RL: Can we remove any of the Java build jobs from Cloudbees Jenkins?

KS: Possibly, will discuss.

AI on KS: Report jobs that can be removed from Cloudbees, or delete them.

RL: Google cloud now hosts Jenkins. However the free VMs only have 600MB Ram, which is not enough to do anything sensible.

RL: Can’t drop Cloudbees until we have a replacement for Doxygen generation and publishing.

MK: Can we temporarily disable running the failing test, or merge the exampleCPP job to allow the APS Jenkins jobs to run properly with the fixed issue?

RL: Yes OK to merge.

AJ: Note that APS Jenkins doesn’t build pva2pva, so it will still fail until I create that.

MD: Fixing the failing example probably requires changing localhost to use 127.255.255.255 instead.

EPICS 7 RPC Server/Client Discussion

KS: pvAccess ChannelRPC vs REST

MD: Name resolution. Easier to integrate with IOC’s.

AJ: Multiple IOCs run on a single host.

AJ: Why is there no pva client for discovering all available pva servers? It could also then attempt to get a description of each service and its functionality.

MD: pvget uses special code to do ‘ping operations’, not using the regular client API.

Priorities for EPICS 7 features in CS-Studio

MD: Is there a table for displaying an NTTable?

KK: There is a Table widget in Display builder which can handle NTTable

KS: There is a VTable widget which can display VTable which has to be created from a NTTable using a formula function.

RL: (wants a tree-structured viewer for PVStructure)

KK: No vType for a tree structure (we have vType for scalars, arrays, Image, Table, but not arbitrary struct), all we get at the moment is a multi-line string.

RL: (also wants to archive structure of arrays of structures, BEAUTY can’t archive these)

KK: It would be totally inefficient; you always store the complete blob even when only one item has changed. InfluxDB only stores scalars over time.

TK: Don’t see archiving blobs as useful.

KK: How about if you split the data into individual channels before archiving?

RL: Then they would have to be reassembled for the client to view them. I’ve been selling pvAccess as providing structures, and now I can’t tell them how to archive structures.
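
Illustrative sketch of the split-and-reassemble idea from KK and RL, using nested Python dicts to stand in for a PVA structure. The dotted channel-naming scheme and both function names are assumptions for the example; no archiver discussed here is known to work this way.

```python
def split_channels(name, value):
    """Flatten a nested dict into {dotted channel name: scalar}."""
    flat = {}
    for key, item in value.items():
        path = name + "." + key
        if isinstance(item, dict):
            flat.update(split_channels(path, item))
        else:
            flat[path] = item
    return flat


def reassemble(name, flat):
    """Rebuild the nested dict from the channels that start with name."""
    root = {}
    prefix = name + "."
    for path, item in flat.items():
        if not path.startswith(prefix):
            continue
        *parents, leaf = path[len(prefix):].split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = item
    return root
```

The sketch only handles nested scalar fields with dot-free names. It does not cover arrays of structures, and it ignores RL's underlying concern: the client must fetch samples with matching timestamps from every channel before it can rebuild one consistent structure.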

TK: We are fixing this for the Archive Appliance.

(more discussion, RL vents steam…)

Driving the requirements of new connection layer library, including types

KS: Part of the Phoebus project, combine vType.pv and pvManager DiiRT into one library. KK created issue ticket with architecture diagram: https://github.com/ControlSystemStudio/cs-studio/issues/1429

MD: What operations will be abstracted?

KK: Low-level with get, put, sync & async, monitor (currently vtype.pv)

Higher Level with caching, queuing, aggregation, event decoupling, formulas (currently pvManager).

MD: Flow control in pvAccess. Monitor event bitmap shows what element changed, and which elements had overruns. Setting queue size to 1 would allow caching client to decimate data on server side. Overrun bit indicates to client that samples were throttled.

RL: Can pvAccess client configure oldest/newest element to be dropped in the server queue? No.
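
Illustrative sketch of the queue behaviour MD describes: a bounded monitor queue that, once full, squashes new updates into the newest entry and marks fields that lost a sample as overrun. The class and method names are hypothetical, Python sets stand in for the bitmaps, and the squash-into-newest policy is an assumption of the example rather than a statement of what the pvAccess server does.

```python
from collections import deque


class MonitorQueue:
    """Bounded update queue that squashes into the newest entry when full."""

    def __init__(self, size):
        self.size = size
        self.entries = deque()

    def post(self, fields):
        """Server side: record an update given as {field name: new value}."""
        names = set(fields)
        if len(self.entries) < self.size:
            self.entries.append(
                {"value": dict(fields), "changed": names, "overrun": set()})
            return
        newest = self.entries[-1]
        # A field that was already pending and changes again lost a sample.
        newest["overrun"] |= newest["changed"] & names
        newest["changed"] |= names
        newest["value"].update(fields)

    def poll(self):
        """Client side: take the oldest pending update, or None if empty."""
        return self.entries.popleft() if self.entries else None
```

With a size of 1, a slow client always receives the latest value of each field, and the overrun set tells it which fields were decimated in between.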

KS: Will go through the process of revisiting the vTypes. Have there been recent changes to the NTypes?

AJ: There are two types of unions…

MD: No difference on the client side for this.

KS: To CSS guys, start at 9am EST tomorrow?

KK: Use the morning to talk about logbook and Masar locally, then we can talk together starting at 9am EST to discuss Phoebus, connection layer etc.

KS: OK. Feel free to update the documents with new topics.

MK: pvAccessJava CA provider is always doing put-callbacks. Need to only do the put-callbacks with the option block=true.

KS: Please create an issue to describe that.

AI on MK: Create an issue on pvAccessJava describing this.

MD: Also default behaviour should be to not wait for callbacks unless explicitly asked.

Demo from Peter Heesterman: Klocwork Insight static analysis tools, finding issues; useful tool. Haven’t tried Coverity, free. Also tried others (parasoft, …).

MD & AJ have used cppcheck, not had time to use this much.

MD: Having someone to triage these is the main issue.

RL: Seen a project which structured its code to pass the static analysis tests.

PH: This tool gives feedback as you’re working; like having a code reviewer watching all the time while coding. Applying it to a large code-base for the first time can be very painful, have to use judgement which issues to use, which to ignore.

RL: For ITER Codac we use <2 tools> for static analysis.

PH: Not easy to export the result-set. Would core devs be interested if I put the work into it?

MD: Yes, but doing the triage is the hard point. Maybe worth doing occasionally.

Wednesday, Oct 4th

Present: TK, MR, Bruce Hill, RL, Kaz Gofron, AJ, MD, HJ

Presentations for Saturday (revisited)

MD: If PvaPy is presented, p4p should be presented.

(Discussion about why there are two APIs, and in which aspects they differ.)

p4p covers client and RPC server, PvaPy also covers plain server.

MD would like to get more feedback on p4p, wants someone to try both and give an unbiased opinion. Currently only supports client and RPC server APIs.

HJ: Offered to ask Patrick who has used pvaPy to look at p4p as a comparison.

AJ: SNS are now using the server API.

MR: Suggest Tom and Dan from BNL might be willing to do comparisons. I would like to have a Python client viewer in AreaDetector.

BH: SLAC using PvaPy and their own CA client library. Can talk to Matt and the pyDM guys to see if they’re interested in looking at it too.

AJ: If we can get feedback from those 3 people/groups that should give us a good place to move forward from.

AJ: Better title than “Core Features Update”? Maybe PVA API Changes? What should we be calling the V4 modules in EPICS 7?

MD: Presented a new simpler client API, and more detailed documentation for the pvAccess client (as updated with the new ownership rules that don’t contain any reference loops in the internal data structures).

MD: Should also cover QSRV (presented new grouping, configuration of the group using JSON info tags, uses 3.16 multi-locking). Currently can only have one info tag in each record, can modify that to allow it to combine tags (Q:group1, Q:group2 etc).

BH: Asked about other ways to do configuration, in a separate file?

MD: Could parse a JSON file directly, as an alternate way of configuring groups. Want to benchmark this group against the equivalent code
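
Illustrative sketch of combining several JSON fragments (for example from tags named Q:group1, Q:group2, or from a separate file) into one group definition. The fragment layout, the function name and the duplicate-field rule are assumptions made for the example; this is not QSRV's parser or its schema.

```python
import json


def merge_group_tags(tags):
    """Merge JSON fragments {group: {field: mapping}} into one dict."""
    groups = {}
    for tag_name in sorted(tags):
        fragment = json.loads(tags[tag_name])
        for group, fields in fragment.items():
            target = groups.setdefault(group, {})
            for field, mapping in fields.items():
                if field in target:
                    raise ValueError(
                        "duplicate field %s in group %s" % (field, group))
                target[field] = mapping
    return groups
```

Rejecting duplicate fields is one possible policy; a real implementation might instead let later fragments override earlier ones.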

AJ: My talk will cover core restructuring, link support updates, int64, etc.

MD: Mention the C++ api changes too (as will I).

Base Restructuring

RL: Base is now 11 modules, one repo with 4 branches for the old parts; a single checkout with submodules checks out and can build everything. The submodule configuration has side-effects; the parent module always records a specific commit for each submodule, so many commits to the core module will be just submodule updates. Travis-CI builds will always git checkout the tip of each module before building; they won’t rely on the tag in the parent. Travis has no hold-off configuration option that we know of to delay builds when it sees a module update in case other modules are about to be pushed to as well. Merging changes up from earlier branches requires deleting files in the wrong modules and moving them into the module they belong to.

RL: Tagging the parent also records the exact commits of all the submodules. Travis scripts keep configuration information (branch names etc).

RL: Should we keep creating the Java tarfile?

MD: Defer to the Java guys.

RL: We promised to publish new versions as separate tarfile to work with existing 3.15 and 3.16 Base. Should we keep supporting 3.14?

AJ: 3.16 dropped VxWorks 5.x completely. That tarfile has to include pvCommonCPP

RL: Other topic…

AJ: The pvCommonCPP Makefile already documents exactly which files to add to libCom/osi/os/vxWorks.

AI on AJ: Copy the relevant Boost files into libCom.

AJ: What is the license for those files?

MD: Boost software license, looks like BSD.

AJ: Checked GNU, they’re good with it.

pvAccess: status of name resolution using UDP Multicast?

MD: Doesn’t exist. Uses multicast to talk to itself, over loop-back; ugly hack.

RL: Should not be too hard to add?

MD: Don’t think so. A couple of socket options, and TTL.
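
Illustrative sketch of the "couple of socket options, and TTL" MD mentions, using the standard BSD socket options via Python. The function names are hypothetical, and this says nothing about how pvAccess would actually integrate multicast searches.

```python
import socket
import struct


def make_search_sender(ttl):
    """UDP socket for sending multicast; a TTL above 1 leaves the subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    # Deliver a copy to listeners on the sending host as well.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    return sock


def make_search_listener(group, port):
    """UDP socket bound to port and joined to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # ip_mreq: group address followed by the local interface (any).
    mreq = struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock
```

The group address and port passed to the listener would be site choices. The socket options alone are not sufficient across subnet boundaries: the routers in between also have to be configured to forward the multicast group.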

AJ: We don’t know if the CA solution works either.

MD: Also want pvAccess name-lookups over TCP.

TK: CosyLab has said that Matej would be available to do work, if the community asks.

RL had an AI yesterday to test (on CA, no point on PVA since it’s not implemented).

Cross-module dependencies that Ralph doesn’t like

RL: (explanation of problem: You need to build the EPICS database even for client-only pvAccess).

MD: I needed initHooks

AJ: Can’t we just move initHooks into libCom?

MD: Yes, want to add more generic facility too, shutdown hooks.

(discussion of readline, debian, anaconda and pyEpics).

RL: Moving initHooks to libCom would fix this.

AI on RL: Implement this move.

MD: wants to see an example of the installable configuration snippets mechanism.

AJ: We will need more library related variables, to get the right set of libraries for a plain IOC, and IOC with pvAccess/QSRV, …

CAS/gdd should install a CONFIG* file (into $(INSTALL_LOCATION)/cfg) that adds “cas gdd” to EPICS_BASE_HOST_LIBS.

AI on RL: implement that.

MD: Asks for an example on how to use the CFG Make variable to install CONFIG files.

AI on AJ: Produce and publish a suitable small example.

Release planning and timetable for EPICS 7.x (.0 or .1?)

AJ: Should we call this 7.0 and keep our convention that .0 releases are not for production?

MD: Wanting feedback

RL: Don’t feel this will be ready for production as released.

MD: Production-ready is something you measure, can’t plan for it. The .0 convention delays people testing this release.

RL: Change the convention of .0 releases to be a warning that downstream modules might not be compatible.

AJ: Implying we can’t release .1 until they are all ready?

RL: Umm.

MD: Main modules would be Asyn, AD, Stream, Seq, Motor?

TK: Is this 7.0 or 7.0.1?

RL: 7.0 (+ third-level .0 for patch-release)

AJ: So the next major release will be 8.x?

RL: Yes

MK: Answer to AJ’s question about a name for the pvData+pvAccess+pvaClient+, how about PVA?

All: Good. The PVA Modules.

AJ: Timetable: Release by when?

RL: Need at least an RC by Oct 31, so ITER testing doesn’t have to be re-done. Bug-fixes and doc changes after that are OK.

MK: I’ve been concentrating on testing for memory leaks and reference loops. Other than that I’m just going to be updating documentation for my modules.

MD: Just need time to review. I have been pushing utilities into pvData, and in the short term this can cause build issues further down the stack.

MK: CA Provider I only changed the reference loops stuff, nothing else

MD: Agree the other stuff is brittle and needs unit tests.

AI on AJ: Will try to get Sinisa to review, and merge if he says they’re good.

AJ: Pre-release would be on Oct 17th then, so all merges need to be complete by then.

RL: Have to finish simulation mode changes by next week then, 2 days, I have tests.

AJ: Final Release November 14th, week before Thanksgiving.

RL: Should I create a graveyard project epics-rip and move unused modules there?

All: Yes.

AI on RL: Register epics-rip project.

Where to put the Gateway (pva2pva)?

MD: Leave it where it is for now, not prepared to move it yet.

RL: Should it be part of the standard build?

MD: Yes, no reason not to (compared to the CA Gateway).

RL: Deserves its own module, but in the next release.

MD: No time for this release. Still has debug prints in it though, if I haven’t done anything with it by the end of next week take it out of the build.

RL: I would move it from PROD to a TESTPROD, so it would still build but not install.

MD: OK.

Overhead of using pvAccess Server instead of AD plugins?

MR: Instead of passing pointers into plugins, should we switch to use pvAccess local servers?

MD: Even local connections currently get serialized (i.e. data copies), so that would be slow.

AJ: Sounds like there could be a utility module that would skip that for local channels.

MR: Want to get around having to copy attributes, maybe using smart pointers.

MD: Suggest change the NDArray class so data storage uses shared pointer.

AJ: Related to Sinisa’s virtual allocation?

MR: Yes, not seen a pull request for that.

MD: I have wanted that in the past.

MK: With plugins you’re already taking care of threading issues, so you could just pass a shared pointer to the whole NTNDArray.

MD: Want to be able to serialize your NDArray into pvData serialization format, without having to copy into a pvData object. Suggest reference counting the pointer to the NDArray, so can copy NDArray without having to copy the image data.
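
Illustrative sketch of the reference-counted buffer idea: copying an array object copies its dimensions and attributes but shares the image data. Python's own reference counting stands in for a C++ shared pointer here; the class and method names are hypothetical and this is not the areaDetector NDArray API.

```python
class NDArraySketch:
    """Array metadata plus a reference to a shared, read-only image buffer."""

    def __init__(self, dims, data, attributes=None):
        self.dims = tuple(dims)
        self.data = data                    # shared reference, never copied
        self.attributes = dict(attributes or {})

    def share(self, **extra_attributes):
        """New array with its own attributes but the same image buffer."""
        clone = NDArraySketch(self.dims, self.data, self.attributes)
        clone.attributes.update(extra_attributes)
        return clone
```

For example, `frame.share(roi="center")` gives a plugin its own attribute set while `frame.data` is passed along untouched. Sharing is only safe if no holder modifies the buffer in place, which is why the sketch assumes a read-only buffer.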

MR: Currently every plugin and driver has its own memory pool; since they are likely to be similar, better to combine them.

AJ: Encourage working with Sinisa to avoid incompatible changes.

RL: Similar requirement for memory allocation & free with connecting to the CosyLab’s NES.

Website / Domain-Name / Hosting

TK: Presented the new logo, website and visual image

RL: Rubik Spiders

TK: Took various content from the existing website to here, in a minimal way, many questions still to answer.

RL: URL is testsite.epics7.org

TK: Structure currently modeled from the existing site. Modules database currently just points to the APS site.

RL: Modules is a typical RDB

AJ: Maybe NoSQL, for flexibility.

RL: Voting on entries, crowd-source the status.

TK: Please inform me if there are any entries that need cleaning up or making private.

RL: 3 items on the front page are not all verbs / nouns / whatever.

TK: (makes RL and AJ editors)

RL: Using Wordpress there are many introductions to using it.

MR: Should we make edits here, or will it be reimported from the old site?

TK: Will not scrape the old site again, make edits here and they will be taken forward.

RL: Hosting?

TK: Idea is to use a commercial host, to distance it from an existing site.

RL: We also have the offer from HJ to host for free.

HJ: Can offer unlimited space for the next 15 years.

RL: Want to provide binary package distributions. We have found something that can do this for multiple distro’s. Good understanding of releases, packages that belong together. Something like that HJ can provide, but might be hard/impossible/expensive from a commercial host.

HJ: Could make updates of the WP database when new downloads available, etc.

RL: Many people have interest in this kind of thing, but are busy.

AJ: Asked about what targets we would offer packages for…

MR: How does it work if you’re developing modules at the same time as having packages installed?

RL: You can do development using a mixture, you just put newer modules in the RELEASE files before EPICS_BASE, which points to the common install location (/usr/lib/epics).

(wide-ranging discussion, stuff about versioning UI files etc.)

TK: Will take up HJ’s offer to host the website, using the epics-controls URL.

Plan for tomorrow

Core dev’s have time in the morning for developing talks and code; AreaDetector and CS-Studio groups will be meeting in parallel.

In the afternoon, BD should be here and we will have discussions with him present.

Thursday, Oct 5th (pm)

Present: RL, TK, AJ, MD, HJ, BD

Python Interfaces

TK: We have 2 problems, what to present on Saturday, and how to move forward.

Decision: MD will present his API, after the pvaPy update.

Presentations for Saturday (revisited)

Kunal’s talks are split and put around Heinz’ Archive Appliance talk.

Future of micro-benchmarking code

MB code to be taken out of pvAccess (that was the only module using it).

PVA Archiving

Extensive discussion about archiving PVA structures.

Goal: official support for archiving all NT structures (mandatory and optional fields).

User defined “any” structures are deemed too hard to store and retrieve efficiently.