EPICS V4 Developer Meeting at APS

Agenda, Tuesday 17th November 2015

9:00        Welcome from Richard Farnsworth

…        Develop Agenda for today (Bob & Guobao arrive tomorrow)

Agenda:

Admin and Process

Technical planning

Minutes

Present: GW, TK, KK, MP, DH, MD, KV, MK, RL, AJ, SV, AA (remote)

Chair: AJ/GW

Scribe: SV

Richard gives welcome address. No dancing this time…

KK: Discusses APS v4 use cases

TK: Timeline for APS-U

SV: 2021 APS starts replacing machine, 1 year timeframe

RF: It is doable

GW: Let AJ drive

AJ: Let’s decide on agenda…

DH: Wants to clarify our process regarding API changes, etc., how do we manage pull requests, what goes into releases.

RF: This is more important than the nuts and bolts, as organizations will look into this before deciding to go with V4

KK: These things could take 3 days… if we could figure them out beforehand

GW: The whole agenda breaks up into technical interests and process; allot half a day to each of those

Original list of meeting items:

https://docs.google.com/document/d/1SB8pxhSy9bYSndUePvT9LJE-Fj_wQw1GeX5sxpKJvj0/edit#heading=h.opgr8wcl9spj

GW: Editing proposed agenda, everybody anxiously waiting...

Admin and Process

AJ Slides: See part of ANJ-Slides.pdf

AJ: Presents slides of how things went; we had a 2-month delay from the original timeline

AJ: “We obviously have issues”

AJ: Is distributed code ownership working?

DH: Part of the reason for the delays: Doxygen/Javadoc errors that had to be fixed; it was not the process, we had documentation issues

AJ: We are not blaming you, just that our process for deciding when to release did not work.

DH: The only hard deadlines were Ralph’s release dates and ICALEPCS. We do need to know why it took 2 weeks for the announcement.

GW: Underlying reason: we have very few customers other than ourselves, and we had other things to do (we are releasing to ourselves)

KK: Nobody cares that V4 slips for a month or two because nobody uses it; people would care if EPICS Base included our stuff

AJ: That is not going to be possible; let’s come back to that. There is the issue of distribution: source code vs. packaging. The people who write the source code should not also do the packaging.

KK: We do not have packaging people.

GS: FRIB is discussing how we can get all the V4 packages officially. We want an official package instead of creating local packages

GW: If the merge into Base is imminent, we should not waste time fine-tuning our release process.

RL: I can’t even start testing until the code freeze. Having a staged process where early modules come a week earlier would be better.

AJ: Proposes splitting the timetable by module

RL: No module in the first phase should depend on modules in the second phase, etc. Basically, a 2-stage process

GW: We should not try to stick to dates for individual modules

RL: Sees 2 phases, GW sees 3

MK: Until pvData is used by the other modules, it is hard to complete integration tests; we should have the option of multiple pre-releases if need be; if developers keep up with master, there should be few surprises

MD: This is a big assumption

GW: Things did work ok this time.

MK: Have multiple pre-releases, and drop release candidate (RC)

GW: We still need an RC

DH: Agreed

RL: The feature freeze is the pre-release; no changes in documentation after RC1

GW: Conclusion: Split up the release into multiple stages; we need a tree of dependencies

RL: Owners of later modules need time to do testing, or I do not get to take a vacation

So, the problem to solve: in the present process, developers of downstream modules are completely dependent on upstream developers completing on the same day that they themselves have to release. We need to separate upstream from downstream.

RESOLUTION: For PRE1 we define 2 phases, with dates for each phase and 1 week between phases:

Phase 1: Java and C++ for pvCommon, pvData, pvAccess, NT.

Phase 2: everything else

ALL: Agreed

MD: We need testing process, regression tests

MK: Does RL’s proposal apply to all stages: PRE, RC, etc.?

RL: This must apply to PRE1. Let’s try it for PRE1 first and see if it works.

DH: We need a phased API freeze

AJ: Before you create the tag on the branch for the PRE1 release, RL needs the branch names for all upstream modules so he can create the build jobs

GW: We do not have a cheat sheet on how to drive Git, Jenkins, etc.

AI on GW: Make his cheat sheet on release process public.
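
For illustration only, a minimal sketch of the tagging step being discussed; the module, branch, and tag names below are invented for the example, and the real conventions are what GW’s cheat sheet should document:

    # Hypothetical names: "release/4.5" and "4.5-pre1" are placeholders.
    git checkout -b release/4.5        # branch name that RL needs for the build jobs
    git push origin release/4.5
    git tag -a 4.5-pre1 -m "EPICS V4 4.5-pre1"
    git push origin 4.5-pre1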

AJ: Nice book on git: https://git-scm.com/book/en/v2 

http://xkcd.com/1597/

DH: When we have deadlines, should we plan what goes into which release, vs. pushing the button on a given date?

AJ: We should get stuff in rather than push the button.

DH: Consequence: when I delay pvData, I delay everything else. What to do then?

DH: When we did the RC, it slipped 3-4 days; the time budgeted for documentation disappeared because we did not adjust the final release dates.

AJ: Fixed dates are probably wrong.

KK: We should not fuss about 4 days for a past release…

AA: What about RL’s French vacations (2 months or so)?

---- break ----

TOPIC 2 of Admin and Process

DH: Main issues are:

Process (how features and core changes are being introduced) is not stable enough. Management is under the impression that our processes are not well-governed.

GW: The process (and practices) should be clear, and the thing you mention was not covered.

DH: Second example:

MD: Should we enforce a proper review process?

DH: Time needed for review depends strictly on the module and the nature of changes.

KK: Without many users, V4 could still be in the storming phase.

GW: What about: “every pull request needs 24hrs, major requests need a telecon”.

MD: Required sign-offs?

AJ: This applies to master (where changes go in), not to release branches (where disruptive changes are not allowed).

MK: Changes in central modules (pvD, pvA) should always have a discussion.

DH: We should have better release planning, in terms of which features are going into which release. The move to GitHub has opened new methods for preparing drastic changes.

KK: In CS-Studio, the rule of “each PR needs a ticket” improved the documentation of changes.

One gatekeeper is in control of merging all PRs.

GW: We discussed that and decided differently.

MK: The pipelining feature was discussed in a sub-group.

DH: But all things that go into public repos must be publicly discussed.

AI on GW:

AA: Can’t hear, can’t see.

Misunderstanding. It was planned for a long time (sums up the requirements).

It was prototyped; then BD saw it working and proposed merging it. AA was not aware of the procedures needed.

DH: Does pipelining include a protocol change?

MD: It’s a compatible addition. The client has to explicitly turn it on.

MK: What is behind the two terms “pipelining” and “streaming”?

MD: Pipelining is the client being able to slow down the server’s rate of updates on the wire. The client acks each update, and the server queues updates for each client so that each client receives every update, with no loss.

KK: This is what we want; we called it “streaming” because that term was used in the further processing stages for the neutron event data, but we should rather call it “pipelining”. We used V4 publish/subscribe to send all detector events to the next processing stage. Pub/sub works well, and pipelining would potentially ensure it works even better.

DH: It was not clear that this was the same issue that was discussed as a minor point back at the Saclay meeting.

GW: But there is no QoS agreement, right?

MD: No. It’s the mechanism, not a policy.

GW: And the pipelining metadata?

MD: There is none. This is on a low level, comparable to TCP “windowing” in the network model.

KK: Is there a description of this?

MD: Of course not.

GW: Please prepare something. This afternoon?

MD: OK.

Technical planning 1:

GW:

DH:

GW: There is actually no shortage of online documents

RL: Considering limited resources, adding a “Read these 5 essential pieces of documentation” getting-started doc would help improve documentation for newcomers

DH: pvDatabase channels currently limited to get/put, no RPC support

KK: What would RPC call to pvDatabase do?

AJ: A motor record would run the motor over a complete velocity profile; the RPC sends the profile

MK: RPC could be deprecated, because put-get with an arbitrary union as parameter/result has the same effect

AJ: The point of RPC might be to have well-defined parameter and result types, not an arbitrary union

MK: put-get can still do that...

GW: SLAC is using RPC services. Interested in a 100% compatible API where RPC might, under the hood, be implemented as put-get.

KK: Examples for RPC are a lot easier than pvDatabase examples. Why deprecate a working and comprehensive API?
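
A minimal C++ sketch (pvDataCPP-style) of MK’s point that a put-get channel whose argument and result are variant unions could stand in for RPC. The structure layout and the field names “argument” and “result” are illustrative assumptions, not an agreed standard:

    // Sketch only: an introspection definition for a put-get "service" channel.
    #include <pv/pvData.h>

    using namespace epics::pvData;

    StructureConstPtr makeServiceStructure()
    {
        FieldCreatePtr fieldCreate = getFieldCreate();
        return fieldCreate->createFieldBuilder()
            ->add("argument", fieldCreate->createVariantUnion())  // caller puts the request here
            ->add("result",   fieldCreate->createVariantUnion())  // server fills in the reply here
            ->createStructure();
    }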

GW: Can the same channel name be used for pub/sub and RPC?

MK, DH: pvAccess could support this (one Channel Provider for both types of operation), but pvDatabase does not at this time

A quick review of what we’re not calling pvAccess Streaming (ie Pipelining)

MD:

Based on ideas from TCP flow control.

There is a sequence number for each channel. Whenever the sender sends a monitor event, the sequence number is incremented. The client must acknowledge each event, but an ack message may cover several events at once.

At some point, if the client is running slower than the server and there are unacknowledged events in the pipeline, the “window size” limit for the channel may be reached. The window size is set by the client at connection time, and there can be at most window-size unacknowledged events held in the buffers.

At that point, the server is not permitted to send any more events until it receives an “ack” message from the client confirming that the client has accepted more events.

The typical buffer size is around 4 (updates), so the window size in pvAccess pipelining is also about 4.

This all means that in the pvAccess connection initialization, the client says what the window size will be, in units of complete discrete monitor updates.

The default is not to do pipelining, but if the client does choose to, it must give a window size. The window size may be 1 or greater; MD can’t remember, but the default may be 4.

In the case of many clients, the strategy for deciding which client to listen to is application specific. Possibly the server should listen to all clients, or to one important client. But presently there is no implemented policy and no supporting API to manage many clients.
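
A conceptual C++ sketch of the window-based flow control described above. This is not the actual pvAccess implementation; the struct and member names are invented for illustration:

    #include <stdint.h>

    // Conceptual server-side bookkeeping for one pipelined monitor.
    struct PipelineWindow {
        uint64_t sent;       // number of updates sent; also the next sequence number
        uint64_t acked;      // number of updates the client has acknowledged
        uint64_t windowSize; // chosen by the client at connection time (e.g. 4)

        explicit PipelineWindow(uint64_t window)
            : sent(0), acked(0), windowSize(window) {}

        // May the server put another update on the wire?
        bool canSend() const { return sent - acked < windowSize; }

        // Called when an update goes out; returns its sequence number.
        uint64_t send() { return sent++; }

        // Called on an ack; one ack message may cover several updates at once.
        void ack(uint64_t highestSeqReceived) {
            if (highestSeqReceived + 1 > acked) acked = highestSeqReceived + 1;
        }
    };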

AJ: Google “buffer bloat” for reasons why the window limit is best kept small (more buffers means more latency as the queue gets drained).

KK: How to enable?

MD: Client sets flag in request.

Basically “pvget -m -r SomeFlagToEnablePipeline channel_name”

MK: pvDatabase is unaware. Does it need to throttle, i.e. _not_ change data, because the server cannot send changes out?

MK: Suggests there must be some message that tells the server to throttle. So there is a missing method in pvAccess that would tell pvDatabase whether it can process.

KK: Suggests we then need some method that gives a queue state indication, so that high-level software can determine whether it must soon, for instance, put in an experiment shutter and so stop the data. MD calls such an indicator a “watermark”.

KK: What was original use case?

MD: A call to a traditional database. Not using ChannelRPC but a monitor where the request contains the query, and the data for the rows is returned as a sequence of monitor events. The result from an RDB or NoSQL store is read from a cursor, stepping as fast as possible based on acks from the client, not losing any data rows.

TK: How about if the client can tell the server that it MUST get all the data, or that it may drop?

MD: Yes, that would be good, but it isn’t implemented

KK: We have 2 different use cases.

  1. A query where the client needs to receive a complete answer. It may be a large answer, but it’s a brief burst of data. The client will not send another request until it has the current answer. The client is self-limiting.
  2. Beamline data acquisition is a rather steady flow of events. A stuffed pipe or a slow client will not automagically reduce the number of events.

In the SNS detector use case, the client does at least know when it is losing data (because of the overflow bit set by pvAccess, plus another sequence number in our data), so it could then close a shutter or use slits to reduce the number of events.

MD: Pipelining would still help in case 2 to better handle brief bursts of higher event rates.

MD: This can be solved by adding an API call the client can use to find out the “watermark”; the watermark would contain, e.g., the number of free buffers. Then if the number goes below, say, 20%, that would indicate that the server should start throttling data.

The idea is that perhaps, as well as an API call to determine the watermark (number of free buffers), there could also be a new field in the returned pvData structure which gives the watermark. In the context of N-types, it would be a “standard optional” field. Then standard clients could make widgets that show the state of the pipeline headroom.
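
A purely hypothetical C++ sketch of what such a watermark query might look like; none of these names exist in pvAccess today, and specifying the real additions is the action item below:

    #include <stddef.h>

    // Hypothetical: pipeline headroom as seen by the client.
    struct PipelineWatermark {
        size_t freeBuffers;  // monitor queue elements still available
        size_t totalBuffers; // configured queue (window) size
    };

    // Hypothetical extension of a monitor-like interface.
    class MonitorWithWatermark {
    public:
        virtual ~MonitorWithWatermark() {}
        // A client (or a widget) could start throttling its data source when
        // freeBuffers falls below, say, 20% of totalBuffers.
        virtual PipelineWatermark getWatermark() = 0;
    };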

GW: Presumably adding a watermark call would be easy?

MD: Yes, very

DH: At the moment experimentalists want to know how many frames they’ve lost. What’s proposed has the property of being lossless (or at least potentially lossless, given the client can take action to slow down the data being sent), and that is certainly desired by experimenters.

RESOLUTION: We’ll add “watermark” information to the pvAccess exchange to facilitate lossless “pipelining”.

AI on MD or MS: Specify additions necessary to add watermark support to pvAccess.

NEW TOPIC of Technical Planning: SNS points of interest (Kay):

Kay’s highest priority is the pvAccess gateway. His group doesn’t have a big need for dbGroup.

KK: Gave his talk from Melbourne and FRIB about transporting SNS Neutron Data.

Discussion during KK’s talk:

KK: Points out that pvAccess -m can result, when the data is really old, in the client seeing very old data “without realizing that”.

RL: Yes, that’s bad particularly for archivers.

DH/MP: The SNS workaround: when you acquire data, make sure to only store the data received from that point on.

Bottom lines:

The SNS experience with V4 was that network capability exceeded the requirement of 10M SNS events/s. Each SNS event is time of flight + pixel ID, ~8 bytes. A 1Gb network could do 15M events/s, a 10Gb network 100M. Tools, and robustness to client and server operational restarts, were very good. The system is installed on 5 beamlines.

NEW TOPIC pvGateway

MD presenting details on pvGateway work. There is a new “Gateway Spec” document detailing desired features and suggested implementation.

Document: pvagw-design-20151118.pdf

KK: rate limiting is useful.

MD: Detailed gateway connection protocol and use of gateway cached data.

AJ: We use a gateway with a name server to avoid passing broadcasts to the secondary network. RL: PVAccess already handles this.

AJ: We also exclude some PVs from CA gateway searches using regular expressions, both positive and/or negative.

AJ: Initial CA requests start off at ~25Hz.

RL: Clients will always retry no matter what gateway does to rate limit.

RL: New PVAccess use-case: a new request for a sub-structure, even though an existing connection may exist. MD: Will check for this in the code.

MD: Reference counting for connected channels, plus a garbage collection loop for unused channels on the server side. Disconnects on the client side will cause the gateway to drop the channel.
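
A conceptual C++ sketch of the channel sharing just described: downstream clients hold reference-counted handles to a shared upstream channel, and a periodic garbage-collection sweep drops upstream channels that no client references any more. The names and structure are illustrative, not the gateway’s actual code:

    #include <map>
    #include <memory>
    #include <string>

    struct UpstreamChannel { std::string name; /* connection to the IOC... */ };

    class ChannelCache {
        // Weak references only, so the cache itself does not keep channels alive.
        std::map<std::string, std::weak_ptr<UpstreamChannel> > cache_;
    public:
        // Each downstream client gets a shared_ptr; equal names share one channel.
        std::shared_ptr<UpstreamChannel> lookup(const std::string& name) {
            std::shared_ptr<UpstreamChannel> chan = cache_[name].lock();
            if (!chan) {
                chan = std::make_shared<UpstreamChannel>();
                chan->name = name;
                cache_[name] = chan;
            }
            return chan;
        }

        // Garbage-collection loop: drop entries whose clients have all gone away.
        void sweep() {
            std::map<std::string, std::weak_ptr<UpstreamChannel> >::iterator it = cache_.begin();
            while (it != cache_.end()) {
                if (it->second.expired()) cache_.erase(it++);
                else ++it;
            }
        }
    };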

KK: An important thing about the gateway is memory management. Does it make sense to use Java instead of C++?

MD: It is possible to do memory management to avoid out-of-memory with C++.

MK: Using shared pointers should protect us from memory leaks.

MD: Has also checked for cyclic memory references.

MD: In order to complete the gateway functionality, pvAccess needs a facility for notifications from the pvAccess server to the pvAccess client that a channel has been closed.

GW: Asked about funding for Michael.

TK: ESS is prepared to fund the PVAccess gateway project to completion of production quality release.

MD: No PVAccess FORTRAN binding.

MD: No caching for Get/Put/RPC operation.

AJ: The original CA gateway responded to gets with the last monitor value, which is not the best idea. But that can be disabled.

RL: This was about reducing traffic on the IOC side. What would be useful is if the client could specify cache/no cache.

AJ: Get-callbacks have to be passed through to the server side (all gets are get-callbacks).

GW: Main use of gateway is for network partitioning. Supporting RPC would allow control across the gateway.

MD: Yes, it would. That’s something the writer of the policies must be aware of.

RL: Could block all RPC data.

MD: The main application is for sharing monitors. It gets messy when multiple channels exist for sub-structures.

RL: Timestamps must be preserved for multiple clients accessing same server side data.

RL: The CA gateway kept the channel open but dropped the monitor subscriptions, so on a reconnect on the client side the gateway would just have to re-establish the monitor.

DH: Gateway is useful for areaDetector.

MD: Gateway is useful for all applications to share monitors. Cached data will be shared between multiple clients.

MD: A newly proposed feature is “Channel Transmit Queueing”. This avoids high-rate channels starving out low-rate channels: give each channel a separate FIFO, rather than one FIFO shared between channels.
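
A conceptual C++ sketch of the per-channel transmit queueing idea (illustrative only, not the gateway implementation): one FIFO per channel, drained round-robin so a high-rate channel cannot starve the others.

    #include <deque>
    #include <map>
    #include <string>

    struct Update { std::string channel; /* ...serialized monitor data... */ };

    class ChannelTransmitQueues {
        std::map<std::string, std::deque<Update> > queues_;  // one FIFO per channel
    public:
        void enqueue(const Update& u) { queues_[u.channel].push_back(u); }

        // One round-robin pass: send at most one queued update per channel,
        // so every channel gets a fair share of the transmit bandwidth.
        template <typename SendFn>
        void drainOnce(SendFn send) {
            for (std::map<std::string, std::deque<Update> >::iterator it = queues_.begin();
                 it != queues_.end(); ++it) {
                if (!it->second.empty()) {
                    send(it->second.front());
                    it->second.pop_front();
                }
            }
        }
    };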

KK: Timestamps are out of order.

RL: Within a channel they are still in order - this is guaranteed. Timestamps between channels are not in order.

KK: Does the FIFO per channel slow things down?

MD: We only need to query FIFOs for data if they are not empty; it is event-driven from the FIFO side.

MD: Need to be careful about network loops - server needs to be able to find out where requests originate from, and automatically ignore some to avoid loops.

KK: Does not prevent issues between multiple gateways, only within a gateway.

MD: Don’t have a solution for that. Need to configure the network appropriately to avoid this.

AJ: Can’t dynamically add a new interface to CA gateway.

MD: As much as possible everything will be runtime configurable.

RL: If I had time, the CA gateway would be rewritten as an IOC to enable online setup, plus a good console interface, to get a list of connected clients for example.

MD: Policies: access control. Need to block puts. Would prefer a rules-based approach, like iptables, rather than the CA access control style.

KK: How to handle data? Would the gateway subscribe to the whole structure to deal with sub-structure requests from multiple clients?

AJ: In general you wouldn’t want to automatically subscribe to whole structure in case it’s a large amount of data. If the client only wants a small piece of data, the gateway should only handle that small piece of data.

AJ: PVAccess and monitors only send changed data.

MD: Gateway would only keep the changed data, after the first update (which would need to be complete). Monitor sharing code not done yet.

DH: Is it possible for the gateway to use a mask to only subscribe to a sub-structure? Would that need a PVAccess change?

GW: May need a PVAccess change in future.

KK: Version number in protocol?

RL: Version numbers will complicate things, especially for gateway.

MD: Will need limits on #clients, #channels per client, #operations, #monitor queue depth.

RL: May need scoped rules rather than a monolithic list. That reduces the number of rules that need to be checked per connection.

AJ: Positive and negative rules may be useful, e.g. allow everything except something in particular.

RL: Could use scoped rules to aid in restricting name resolution broadcasts to particular networks. BESSY has lots of networks, interconnected by many gateways, with redundant data copies in each gateway. This would be avoided by having cached data in the gateway.

AJ: We may have a complex configuration file, but it gives us lots of options.
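
A hypothetical illustration of such a rules-based, scoped configuration. The syntax below is entirely invented; the real format is still to be defined in the gateway design document:

    # Invented example syntax only.
    scope "beamline-net" {
        allow pv  "BL7:*"       # positive rule: expose all BL7 PVs
        deny  pv  "BL7:PS:*"    # negative rule: except power-supply PVs
        deny  op  put           # read-only from this network
        deny  op  rpc           # block RPC through the gateway
        limit clients 200
    }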

RL: A set of gateways would ideally act like a network switch, where many connections all go into one process with a shared data cache and rules.

AJ: Shall we call this a PVSwitch?

MD: My name for it is PVAToPVA.

KK: BGE - BestGatewayEver.

MD: Need to configure queue sizes.

GW: Pipelining through the gateway?

MD: Would need to define a strategy. Gets complicated. Pipelines could go both ways through gateway. Could define amount of memory used for each channel.

GW: With put-get, could two different clients request data but get different values back?

AJ: Yes, but both are atomic on server side, since put-get is atomic.

MD: Yes we want to do pipelining through gateway.

GW: Would need to control put-get operations to manage control through the gateway.

MD: Demo: a soft IOC with pvaSrv producing one PV called ycnt. The gateway demo has an IOC shell console. The gateway also provides an alias called xcnt, mapped to ycnt. The current implementation is limited to connection sharing; monitors are not cached/shared.

MD: Currently not possible to bind client and server to different network interfaces.

MD: The code is on MD’s GitHub, but not advertised yet.

AJ: Can we get a PDF of the gateway document?

MD: Yes.

MD: Monitor sharing not implemented, but starting to work on this.

Consensus: Demo showed progress with beginnings of a gateway. Great start.

MD: To give AJ his requirements document, to be put in pvDataWWW.

NEW TOPIC: Demo of Neutron Data Server

A nice demo showing pvData structures over pvget of neutron data, and the CSS screens of experiment control and event plots.

Meeting adjourned 4:45pm