Meetings/Minutes/2009-02-09

OpenDNSSEC meeting 9/2/2009
---------------------------

1. Minutes: Matthijs

2. System architect.

<Roland> Do we need a system architect?
<Roy> Background: Jakob and I had this idea a few years ago. No implementation. Asked NLnetLabs. They had their own idea, called Masterdont.
I would like to stick to the original design, on the wiki. Jelte raised some questions about cache and other stuff.
<Roland> The system architect has a helicopter view and takes decisions in what parts go where.
<John> Too small a project, there are strong views.
<Stephen> "This is the way, go ahead".
<Roland> Strong views can conflict.
<Rickard> Project manager's task?
<Roland> PM does milestones.
<John> We are all system architects.
<Roy> There is confusion, but no opposing views.
<Stephen> We can always vote.
<Rick> Missing structure, that's why the diagrams.
<Rickard> System design.
<Stephen> We have design, but what is it going to do.
<Jakob> When we started, we had this idea: How to build such a thing? Everybody joining hasn't had the same view. We need to clarify.
<Stephen> For production use, we need to build something with high standards.
<Jakob> Concern: As we grow, we need to do more explaining. This leads to less "real actions". We need to limit the group growth.
<Stephen> Draw the line somewhere.
<Rickard> Use cases.
<Roland> Use cases limit the explaining.
<Roy> Process creep. NLnetLabs is going to implement this. 
<Jelte> KASP. 
<Roy> KASP is a database with information. How it is pulled is up for discussion. But watch out for thick overhead.
<Stephen> But we need documentation.
<Rick> Someone who monitors
<Matthijs> Sideline?
<John> No need for formal roles.
<Rickard> Use cases may fit here?
<Roland> Why I raised this issue: I was worried about divergence. But maybe the issues on the list are already solved. 
I consider myself a user. Everybody is coding their own way.
<John> There are meetings in bars.

<Rickard> Agree on consensus on the project. We have them but it is not documented. Do we need a system architect?
<Roy> I volunteer.
<Jakob> Roy or me.
<Jelte> You have to be careful about the implicit information in your head.

<Jakob> We will document this on the wiki.

* Jakob is appointed system architect.
* Jakob will document the implicit information that is not yet documented, that Roland and Stephen mentioned.

3. Marketing Plan.

3a. Use cases.

<Jakob> We had some use cases presented.
<Stephen> This use case was for version 1; it is important to split version 1 and 2. We are aiming to sign uk. Version 1 
is really what they are looking for. In terms of marketing: We need to step back and look at what we are doing. Don't 
do marketing too early. If we can get something together at short notice, we can do marketing.
<Jelte> Only marketing we should do: there is progress.
<Stephen> Do we need to extract the abstract, mission statement? 
<Wouter> Mission statement is actually requirements.
<Patrik> 
<Rickard> a good web page is good for marketing. But for the current situation?
<Stephen> There could be more there.
<John> 
<Roland> Focus of the wiki is for the project group. We need a website for the users.
<Rickard> Do we need it now?
<Rick> Yes, since RIPE is running away?
<Roy> RIPE is not running away
<Rick> Sure, but they are indicative of the reaction to the website: It is incomplete and makes 
people think this is a dead track.
<Jelte> Show progress
<Roland> Like for example news updates
<Rickard> We have a timeline.
<Rick> And the target group.
<Rickard> We have some info about that. But do we need it now? Too much overhead?
<John> Put the current stuff on the technical page and abstract and stuff on the home page.
<Jelte> Agree.
<Roland> Link to our white paper
<Rickard> Agree: We need a better first page and an extended technical page. Should we publish the use cases?
<Stephen> Yes.
<Jakob> If you don't understand the project right now, you probably should not run this stuff. If we have something, we should tell people.
<Rick> But notify that something is coming up.
<Matthijs> I think there is a general agreement on showing progress. 
<Rickard> Timeline?
<Rickard> We need to assign someone to make the marketing/welcoming webpage.
<Rick> I volunteer.
<Stephen> I put the use cases on the website. How?
<Jakob> Put it in subversion, we set up an account.

* Stephen puts his use cases on the wiki.

3b. Market study.

<Rickard> 
<Patrik> I talked to se
<Rickard> Do we need to study the market? 
<Roland> I am giving a presentation and listen to the audience and will forward this to the ML.
<Stephen> I am in contact with uk.
<Rickard> We have to check our own contacts.

* No specific actions needed.

3c. Marketing our product

<Rickard> Later stage?
<Rick> At the hacker conference.
<Roy> I have an issue with cracking OpenDNSSEC.
<Rick> The intention is to crack DNS, not DNSSEC. Creating awareness of DNSSEC as a solution.
<Jakob> Really good idea, but not part of the project. It should not be on our agenda.
<Matthijs> Postpone the decision.
<Roy> My main focus is to implement OpenDNSSEC on uk.
<Stephen> We don't have the effort right now.

* No specific actions needed.

3d. Trademark.

<Rickard> How?
<Patrik> Should we?
<Wouter> We have a website, we should be fine.
<Patrik> In order to have the trademark, we need the name in marketing stuff.
<Rick> To keep a trademark, you must also defend it in court. Nobody in this group seems eager to do so.
<Roy> FWIW, I registered the names (not the trademarks). Nominet is willing to do litigation stuff?

* No special actions needed.

4. Tasks, priorities, scheduling.

<Rick> Not for version 1. That's why I put points 4 and 5. 
These are things that I want to bring to our attention because
being aware of them can help us to make our choices better when we design
or code OpenDNSSEC.
<Matthijs> Not for now.

<Rick> I want to make the point that the current design seems to pop up
events in several places; we could be reinventing wheels and complicate
our lives if we don't keep track of the following points:
 - scheduling time-bound tasks is a solved issue, we can use an off-the-shelf
   realtime scheduler algorithm instead of gradually inventing our own.
 - the events that trigger work may be hard to integrate in such an overviewing
   scheduler if they come from assorted places, each following their own
   business logic and concealing the demands that are about to happen.
 - as an example of a complicating situation, Matthijs has mentioned the
   need to take a system offline for scheduled maintenance.  Getting this
   into a signer's schedule will only work if the scheduler can overview the
   work load ahead.
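
A minimal sketch of the "overviewing scheduler" idea in the points above, assuming a plain earliest-due-first queue stands in for whatever off-the-shelf scheduler would be chosen. The `Task` and `Scheduler` names, the task labels and the times are hypothetical; the point is only that one central queue lets you ask what work falls inside a planned maintenance window.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    due: int                           # due time, in seconds since some epoch
    name: str = field(compare=False)   # label only; not used for ordering


class Scheduler:
    """Single queue of time-bound tasks, ordered by due time."""

    def __init__(self):
        self._queue = []

    def add(self, due, name):
        heapq.heappush(self._queue, Task(due, name))

    def pop_due(self, now):
        """Remove and return every task whose due time has passed."""
        ready = []
        while self._queue and self._queue[0].due <= now:
            ready.append(heapq.heappop(self._queue))
        return ready

    def work_between(self, start, end):
        """Look ahead: tasks that would fall inside a maintenance window."""
        return sorted(t for t in self._queue if start <= t.due < end)


sched = Scheduler()
sched.add(3600, "re-sign zone A")
sched.add(7200, "ZSK rollover zone B")
sched.add(90000, "re-sign zone A again")

in_window = [t.name for t in sched.work_between(3000, 8000)]
ready_now = [t.name for t in sched.pop_due(now=4000)]
```

If events are instead raised from several components, each with its own logic, there is no single place where `work_between` could be answered.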

* No special actions needed.

5. Robustness through redundancy.

<Rick> Robustness. Another issue to keep in mind when designing a solution that can scale
up to the future. Redundancy is required if important domain names like TLDs want to avoid
being dependent on a single piece of hardware.
<John> Explicit decision to let the operator decide how to run.
<Stephen> We need to set up an operator guide.
<Roland> I would be happy to sponsor these documents.
<Wouter> Important for large companies, but not for "mom and pop bakeries"
<Stephen> What are our target users?

* No special actions needed.

6. Design diagrams.

<Rick> The use cases integrate easily (from Rick and Stephen), but something is missing.
1. Adding and removing zones.
<Stephen> Split up version 1 in a) one zone, b) multiple zones?
<Roy> If you can sign one zone, you can sign N zones. So no.
<John> A problem is: No frontend KASP.
<Roland> Define interface, clearly define interface API.
<Rick> ...
2. How to schedule hardware, with respect to hardware maintenance downtime?
<Matthijs> Also version 2.
<Rick> ...
3. Change NSEC3PARAM. 
<Jakob> You can change it in the policy, but no rollover
<John> Describing KASP design. At some point, we discovered needed parameters. I want KASP to be minimal.
...
<Roy> Signer Engine is done. Every value it gets it from KASP.
<Jelte> 3 Things: Key rollover, zone data can change, signature expire (sigexp).
<Roy> sigexp is defined by KASP.
<Jelte> Who kicks who when that happens?
<John> Signer engine needs to maintain state.
<Jelte> I have a python script that schedules sigexp.
<Jakob> What component initiates what needs to be documented.
<John> ...
<Jakob> Initially, the signer engine asks KASP information any time. 
<John> I thought it was the other way around.
<Rick> Which one has the intelligence?
<John> My idea, it is KASP.
<Rick> I don't care but I think either KASP or Signer Engine should be fairly stupid, as in, not having an 
overview or "business logic".
<Roy> Version 1: In KASP, cron job if something has changed.
<Stephen> Update once a day.
<Patrik> For se, incremental signing is a requirement.
<Matthijs> It doesn't matter if the key rollover is a day late?
<Jakob> No, it is a static value. For emergency key rollover, roll the signer engine script one more time
<John> This conflicts with my ideas.

<Roy>* presenting a slide
1. Retrieve the zone from the master and put it on disk. dig
2. Get dnskeys from KASP:
	- add published keys
	- update soa
3. Get zone from disk
	- Sign the zone (fully)
4. Write signed zone file
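
A minimal sketch of the order of the steps on the slide, assuming the zone is held as a list of (type, value) tuples. Fetching the zone, reading and writing files, and the actual cryptography are left out; `sign_fully` is a placeholder that only adds marker records, and all function names are hypothetical.

```python
def add_published_keys(records, dnskeys):
    """Step 2a: append the DNSKEY records obtained from KASP."""
    return records + [("DNSKEY", key) for key in dnskeys]


def bump_soa(records):
    """Step 2b: increment the SOA serial."""
    return [("SOA", value + 1) if rtype == "SOA" else (rtype, value)
            for rtype, value in records]


def sign_fully(records):
    """Step 3: placeholder for signing; adds one RRSIG marker per record type."""
    rtypes = sorted({rtype for rtype, _ in records})
    return records + [("RRSIG", rtype) for rtype in rtypes]


def run_once(unsigned_zone, kasp_dnskeys):
    """Steps 2 and 3 in order; file and network I/O (steps 1 and 4) are omitted."""
    zone = add_published_keys(unsigned_zone, kasp_dnskeys)
    zone = bump_soa(zone)
    return sign_fully(zone)


unsigned = [("SOA", 2009020901), ("NS", "ns1.example."), ("A", "192.0.2.1")]
signed = run_once(unsigned, ["zsk-1"])
```

The sketch is a single pass and says nothing about when it runs.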

<Wouter> There is no timing here. This is key logic.
<Stephen>
<John> KASP is a db with an API, the KASP enforcer implements the KASP.
<John> 
<Wouter>
<Jakob> How often should we do this?
<Matthijs> Is the KASP Enforcer that John describes the Signer Engine?
<John> Probably:)
<Jakob> Has everybody read things on the WIKI?
<Matthijs, Jakob, John> Confusion about terminology.
<John> Stephen and I will update terminology.

* John writes a draft about terminology.

7. System design. 

<Matthijs> *thinking: Are we already at point 7b?

* No special actions needed.

7b. Components. 

<Jakob> 
<Wouter>

* No special actions needed.

7c. 

<Rickard> Data flow diagram to know who talks to who
<John> Yes. some formal API
<Wouter> What have you envisioned for version 1.
<Stephen> 1a: zonefile in, zonefile out. 1b: axfr in, axfr out.
<Jakob> Clear line between KASP enforcer and signer engine.
<Wouter> ...
<Stephen> Serious difference between 1a and 1b?

* No special actions needed.

Lunch

7d. Additional point: other matters that came up.

<John> Enforcer reads policy, tells signer to keep zone signed using these config parameters.
It signals the new config parameters if the config is updated.
<Patrik> Who updates the wiki?
<John> I'll write a draft.

* John writes a draft about APIs between Signer Engine and KASP Enforcer.
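
A minimal sketch of the interaction John describes, assuming the Enforcer pushes parameters to the signer and only signals again when they change. The class names, the `configure` and `enforce` methods and the parameter names are hypothetical; the real API is what the draft is meant to define.

```python
class Signer:
    """Keeps each zone signed with the last parameters it was given."""

    def __init__(self):
        self.config = {}      # zone name -> parameter dict
        self.updates = 0      # number of config signals received

    def configure(self, zone, params):
        self.config[zone] = dict(params)
        self.updates += 1


class Enforcer:
    """Reads policy and signals the signer only when parameters change."""

    def __init__(self, signer):
        self.signer = signer
        self._last_sent = {}

    def enforce(self, policy):
        for zone, params in policy.items():
            if self._last_sent.get(zone) != params:
                self.signer.configure(zone, params)
                self._last_sent[zone] = dict(params)


signer = Signer()
enforcer = Enforcer(signer)
policy = {"example.com": {"sig_validity_days": 14, "zsk_bits": 1024}}

enforcer.enforce(policy)          # first run: config is pushed
enforcer.enforce(policy)          # unchanged policy: nothing is sent
policy["example.com"]["sig_validity_days"] = 7
enforcer.enforce(policy)          # changed policy: new config is signalled
```

In this arrangement the signer holds state but no policy logic, which is one of the two options raised in the discussion of where the intelligence should live.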

<Roy> What was the conclusion of the test discussion?
<Stephen> Maybe we need integrity checking, test cases written by someone else.
<Jakob> Like a validation thing with an emergency shutdown?
<Stephen> Yes.
<John> Also, we could do integrity checks on KASP. 
<Patrik> This is different than monitoring zonefile
<Stephen> Who produces logfiles.
<Rickard> v2.0?
<Stephen> This is low hanging fruit.
<Jakob> An audit component?
<Stephen> Yes.
<Jakob> Is it a bump in the wire thing? Stopping the server when something weird happens?
<Stephen> No, monitoring is different than auditing
<Jelte> What log levels do we need. I usually use 5.
<Rick> Distinguish the logging for coding/debugging from that which suits the operator
<Jelte> I am logging developer information
<Jelte> 0: nothing, 1: error, 2: warnings, I think, 3: business logic, 4: external calls, 5: very verbose.
<Stephen> I need operator information
<Rick> There are standard levels for that in the syslog.
<Jelte> If we are going to use syslog, we can use these levels.
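
A minimal sketch of how Jelte's six developer levels could be mapped onto the standard syslog severities. The mapping itself is an assumption; the meeting did not agree on one. The severity numbers are the standard syslog values; `DevLevel`, `TO_SYSLOG` and `should_log` are hypothetical names.

```python
from enum import IntEnum

# Standard syslog severities (0 = emergency ... 7 = debug).
LOG_ERR, LOG_WARNING, LOG_NOTICE, LOG_INFO, LOG_DEBUG = 3, 4, 5, 6, 7


class DevLevel(IntEnum):
    NOTHING = 0
    ERROR = 1
    WARNING = 2
    BUSINESS_LOGIC = 3
    EXTERNAL_CALLS = 4
    VERY_VERBOSE = 5


# Assumed mapping, for illustration only.
TO_SYSLOG = {
    DevLevel.ERROR: LOG_ERR,
    DevLevel.WARNING: LOG_WARNING,
    DevLevel.BUSINESS_LOGIC: LOG_NOTICE,
    DevLevel.EXTERNAL_CALLS: LOG_INFO,
    DevLevel.VERY_VERBOSE: LOG_DEBUG,
}


def should_log(message_level, configured_level):
    """A message is emitted when its level is within the configured verbosity."""
    return DevLevel.NOTHING < message_level <= configured_level


operator_view = should_log(DevLevel.WARNING, DevLevel.WARNING)
developer_only = should_log(DevLevel.EXTERNAL_CALLS, DevLevel.WARNING)
```

With the verbosity set to WARNING, an operator would see errors and warnings, while the external-call traces stay hidden.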
<Jelte> Are we writing this only for unix systems?
<*> Yes.
<Stephen> why?
<Wouter> Windows is not POSIX compliant.
<Jelte> If you are doing network code, you want to know when what happens.
<Rick> Some "idiot" will try to run OpenDNSSEC on Windows, meaning there are bound to be people, like 
low-trained MS-only admins, who will want to run it on Windows.
<Patrik> Windows will have a sign interface.

<Rickard> Consensus on easy porting (to Windows)?
<Rick> In my mind people thought it was a good idea to refrain from using code that was inherently Unix-only, 
but Cygwin was considered a suitable assumption for those running it on Windows. As your notes indicate, 
however, the attempt to obtain a consensus watered down and was not pulled together by the chair.
<Roy> Just continue building. 

<Rickard> About the auditing component.
<Wouter> configure timing functions to test, for example rollovers.
<Stephen> This is detailed requirements stuff. I can send to the group a list of requirements we came up with.
<Rickard> After that, should we have consensus?
<Stephen> Set a deadline for requirements and review.
<Roland> Who edits it?
<Stephen> I'll edit it.
<Jakob> To clarify: this is the project plan requirements document.
<Stephen> Examples: must be able to access syslog, ...

* Stephen and Rick update this document with auditing requirements.

8. APIs

<Rickard> This will be discussed outside the meeting.

* No actions needed.

9. Discussion about components.

<Jakob> First component: KASP Enforcer
<John> I define KASP and metastore as one. It is not finalized. Stephen wrote a KSM commandline tool. We have
a partially finished version of KASP. Shaun and John will finalize it.
<Roland> You have a database schema? Can I use any db?
<John> For now, only mysql.
<Stephen> Later on, also sqlite.
<Rick> Why no generic switch.
<Roland> I prefer a db API, so users can choose which db to use.
<John> Last meeting, we said: we default to sqlite.
<Patrik> The db layer can be built at a later point.
<John> The current db layer can be replaced.
<Rick> As a user, I don't want to code.
<Stephen> sqlite is just a file on disk. 
<Roland> Then probably you don't need such a db layer, the discussion is not relevant anymore.
<Rick> We will get back to you for more functionality when we start talking about advanced issues, such as 
redundancy.
<Rickard> @John: How long do you need to finish?
<John> I have 4 weeks of funding left. 
<Stephen> We also need to adjust the KSK logic, we now only have ZSK logic. There is a draft about it soon, we will post it.
<Rickard> Can this code be in svn?
<John> Soon.
<Wouter> What about the Enforcer?
<John> In progress.

<Jakob> We need something to reference keys.
<Roy> Roland can help us out? We want to use PKCS#11. We need a way to identify the HSM and the object on the HSM.
<Roland> I was responsible for the PKCS#11 library, which was also used in a Windows crypto library. This is an arbitrary issue:
We use hashes of keys as identifiers. Keys can be identified by the hash.
<John> path to the library can be identifier of the HSM.
<Roland> Yes.
<Roland> However, multiple tokens is nasty. So you want to have slot and token as id. 
...
<Jelte> Don't use keytags as identifiers. They change and are not unique enough.
<Roland> But the public key data is always unique, so you can use the hash (fingerprint) as the identifier.
<Matthijs> I missed something. 
	To summarize: how to identify the HSM? Search it, store the slot and token in the cache. This is the identifier for the HSM.
<Patrik> Who will specify this?

* Jakob does.
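
A minimal sketch of the identifiers discussed above, assuming SHA-1 as the fingerprint hash and a (library path, slot, token label) triple for the HSM. The hash algorithm, the `HsmRef` and `key_id` names, the library path and the sample key bytes are illustrative assumptions; nothing here was specified at the meeting.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class HsmRef:
    """Hypothetical reference to one token in one HSM."""
    library_path: str   # path to the PKCS#11 shared library
    slot: int           # slot number reported by that library
    token_label: str    # label of the token found in the slot


def key_id(public_key_bytes):
    """Identify a key by the SHA-1 fingerprint of its public key data."""
    return hashlib.sha1(public_key_bytes).hexdigest()


# Example values only: the path and key material are made up.
hsm = HsmRef("/usr/lib/libsofthsm.so", 0, "OpenDNSSEC")
key_a = key_id(b"\x03\x01\x00\x01" + b"\xaa" * 128)
key_b = key_id(b"\x03\x01\x00\x01" + b"\xbb" * 128)
```

Unlike a 16-bit keytag, the fingerprint depends only on the public key data, so it stays the same when DNSKEY flags change.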

<Jakob> Look at the configuration for the Signer Engine. I think it may need adaptations. 
Its on the wiki: http://www.opendnssec.se/wiki/Signer/KeyManagement

<Jakob> The Signer Engine.
<Jelte> The individual parts are there, and now there is a python script that adds them together.
Apart from features like expiration date and KSKs, it's done. Please try and give feedback.
It doesn't scale, you run out of file descriptors. I'll make some tickets and documentation.

* Jelte adds some documentation for his Signer Engine script.

<Jelte> I don't mind to throw it all away, if someone comes up with something better.
<Rickard> Is it much work to upgrade?
<Jelte> Probably not.
<Rickard> Timeline?
<Jelte> Not really. Can be done in a few weeks.

<Rickard> SoftHSM. Almost done. There are some minor tickets.
<Roland> Is it PKCS#11 compliant? You have a single slot and a single token? If so, that creates a new security layer.
It makes more sense to add more tokens: A token is one security world. You login on a token.
<Rickard> And?
<Roland> You can have another token in the same slot.
<Rick> Why need multiple HSMs?
<Roland> No: What are you trying to solve?
<Roy> It is not violating the spec.
<Roland> It is not implemented this way.
<Roy> so, I can have multiple keys on a single slot.
<Roland> token.
<Roy> Yes, but tokens are gone, there are only slots.
<Roland> But you have one or more slots, and each slot has one or more token, which may be switched.
PKCS#11 does not have a way to add a new token.
<Jelte> If you use different set of keys, if you are not a user, this might be a problem.
<Roland> It can only be user, that is in the specs. I can see what you are trying to solve, but try something different.
<Roy> A token can contain multiple objects?
<Roland> Yes. So use something outside PKCS to create tokens.
<Roland> Look at nCipher for more information, for an example of out-of-band token creation in an HSM.
<Roy> To clarify: You have a session with a slot. So you login to a slot, which contains at least one token. (Without token you cannot login).
<Stephen> What about rollover to HSMs from different manufacturers?
<Roland> PKCS does that for you.
<Rick> Why is the security officer not implemented.
<Rickard> Not needed until now. Just need to create pins. Not a top priority.
<Rickard> I cache object attributes in the db....
<Roland> What?
<Rickard> Everything
<Roland> People should be aware.
<Rickard> Yes there is a disclaimer. SoftHSM is for proof of concept.
<Roland> Is the private key on disk encrypted?
<Rickard> No. This must still be done.
<Roy> Lower priority.
<Roland> But it is really easy to add that.
<Jakob> The people will believe that if they start the program, it is ok to enter the password. When keys rollover, it should be possible to
not enter the password.
<Roland> Yes, that is acceptable.
<Rickard> Conclusion: SoftHSM has to be a bit more secure.
<Rickard> The reason it is not there is that I have designed the SoftHSM incrementally.
<Roland> One other point: I have a PKCS#11 compliance test script: I need to know what functions are called. 
Need to talk to Jelte 'ldns' Jansen.
<Jelte> I am only signing, no creating.

<Rickard> next thing for SoftHSM. If the KASP Enforcer creates a key, it won't be inserted in the SoftHSM automatically.
<Roland> Shared mempage?
<Jelte> Synchronize at session end?
<Roland> Yes. Something that is created in one session does not need to be in another session.
<Rickard> What about ...
<Jakob> We got some feedback from Mr. Leuven. 

<Jakob> PKCS 11 could run over a network if you have a generic proxy that passes PKCS #11 calls over a network.
<Roland> I think such stub/proxy combos exist -- don't know exactly which are available in the open.
<Roland> Vulnerability will be on the link between the stub and remote PKCS #11 client -- specify very carefully.

<Jakob> It is a requirement to talk to multiple HSMs.

<Rickard> The Inbound and Outbound adapters. We want to postpone these definitions.
<Jakob> For now Inbound: drill, Outbound: nsd. :)
<Rick> Inbound: file, AXFR, IXFR.
<Jakob> IXFR is difficult, because you have to have diffs.
<Jelte> diffs is not that hard.
<Rick> and what about scp?
<Jelte> anything you can do with files, you can build on top of that.
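
A minimal sketch of the "diffs" an IXFR-style adapter would need, assuming each zone version is available as a list of (owner, type, rdata) tuples. The function name and the record representation are assumptions for illustration; the actual wire format of IXFR is not modelled.

```python
def zone_diff(old_records, new_records):
    """Return (deleted, added) record lists between two versions of a zone."""
    old, new = set(old_records), set(new_records)
    return sorted(old - new), sorted(new - old)


old_zone = [
    ("example.com.", "SOA", "2009020901"),
    ("www.example.com.", "A", "192.0.2.1"),
    ("ftp.example.com.", "A", "192.0.2.2"),
]
new_zone = [
    ("example.com.", "SOA", "2009020902"),
    ("www.example.com.", "A", "192.0.2.10"),
    ("ftp.example.com.", "A", "192.0.2.2"),
]

deleted, added = zone_diff(old_zone, new_zone)
```

Here `deleted` holds the old SOA and the old `www` address, and `added` holds their replacements; the unchanged `ftp` record appears in neither list.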

11. OpenDNSSEC v1.0 and v1.1.

11a. How should we fit everything together.

<Stephen> At this stage, it is a bit down the line. Get the groups together.

11b. Testing.

<Stephen> At minimum, we need unit testing. Let's agree on a set of testing tools.
Then there is system testing, test against requirements.
And we have tests for failure (Whitebox).
<Wouter> I think it's important that the program works well if things are failing.

Whiteboard action going on...
-----------------------------------------------------------------------------------
Unit
- Use frameworks like, cunit, c++unit
	<Jelte> Use automated testing on nightly builds. We use unit tests upon svn checkin
	<Rick> I am paranoid about c++ for softHSM (semantics)
	<Jelte> I don't use c++ at all. KASP and signing stuff is in c.
- Guidelines

System
- Not always automatic
- Against the requirements
	<Wouter> Fuzztesting.
	<Stephen> Test requirements, corner cases, wrong input
- Memory leaks
	<Rickard> Valgrind?
	<Stephen> You can define which memory manager you use and do these tests automatically.
	<Rickard> There is a difference between systems and components.
	<Stephen> Yes: System testing tests requirements, memleak testing is more related to components

Whitebox
- Expert tester
	<Roland> You need an expert tester (a hacker?)
	<Stephen> You need someone outside the project.
	<Rickard/Stephen> Leave it for the moment.
	<Patrik> I will hire someone to review the SoftHSM code.

Code reviews 
- static code analysing
- manual
	<Stephen> We have a license for code analysing.
	<Roland> I know someone.
--------------------------------------------------------------------------------------

* Stephen and Rick will write these guidelines.
* Roland will review it.

<Jakob> Which language?
<Rickard> c. python? c++ for softhsm
<Roland> Is there need for scripting language?
<Jelte> Scripting language makes it easy to iterate versions.
<Roland> When switch to c?
<Roland> Scripting language adds a new dependency and risk.
<Roland> python is complex, and has probably some security bugs.
<Stephen> Bourne Shell?
<Jelte> Please not, unless it is used for calling other programs.

12. Comparison of the HSMs.

<Rickard> There is a comparison on the wiki. Do we need a more detailed comparison?
<Roland> You will need commitment to review HSMs.
<Stephen> Just publish a list: We have tested these ones with OpenDNSSEC.
<Jakob> They won't give you the cost. That is a good thing to add to the comparison
<Jelte> And how many keys it can store.
<Roland> List prices will probably not be approved.
<Jakob> If they won't give the price without NDA, we won't put them on the list.
<Rick> Back on track: I don't think it adds value.
<Jakob> Manufacturers will 'forget' to tell information. We want to get it "above water"
<Roland> Is it part of this project?
<Stephen> Or just say: These HSMs are on the market, these things are important for running with OpenDNSSEC.
Once you start naming products, there is an implicit commitment.
<Jakob> A list of things that are important won't help people much.
<Roland> I think it does, people have a checklist. You can't have an exhaustive list of HSMs.

* Roland writes a buyers guide for OpenDNSSEC HSMs.

<Roy> Maximum key size could be added to the buyers guide, 2K?
<Jelte> Lifetimes of keys as described in rfc 4641 are under attack, lifetimes will go up.
<Patrik> jp is considering 4K keys.

* Stephen makes a comparison list between HSMs, with respect to OpenDNSSEC compliance.

13. About the project.

13a. Packaging the code.

<Jelte> What usually happens, how we do it: we publish the distro, and someone else does the spec file.
That works well for nsd/ldns/unbound. I can ask, but it is a volunteer's job. (Once we have code).

13b. Statistics.

<Jelte> Is it much work?
<Rickard> No.
<Matthijs> Then let's do it.
<Rick> Anything generic, not personal.
<Jelte> counting page visits?
General consensus: ok.

15. Next Meeting.

Every 2nd week a teleconf. See doodle.
face2face: San Francisco with the people attending IETF. 
face2face: Amsterdam with the people attending RIPE

14. Other questions

<Rickard> Deadline for v1.0. v1.1. at the beginning of April. Any deadlines before this?
<Stephen> How realistic is this? Task to submit the estimate before next monday, and 

<Matthijs> Discussion about dataflow. Are there no conflicting thoughts?

Picture from the paper whiteboard:

user --- pollnow --> KASP  <--- poll regularly --> db
                      |
                      |
                      --- cfg  -->
                                    Signer Engine  ---> file out* ---> [audit] --> ns

      Inbound Adapter --- zone -->

* AXFR out in v1.1

<Roland> Static linking? Someone suggested that we should statically link PKCS#11. 
<Roy> Reason is that it is trivial to replace a library which is basically a proxy.
<Roland> Big advantages for dyn link.
<Roy> Conclusion: Dyn link is fine.
<Rick> as a default, that is. And it's not "fine" it's "the best" option.

<Wouter> Control flow is already discussed.