author Greg Waines <greg.waines@windriver.com> 2018-11-20 12:57:44 -0500
committer Greg Waines <greg.waines@windriver.com> 2018-11-26 14:52:55 -0500
commit 1ed103250b3d6df2bae92b0cee794bfce835d7af (patch)
tree 8704ecc00500872c55bf75ac5ee1b74d112faff7
parent d7732f0238c0b5b09d8cf3ea8fc93c5e49624508 (diff)
Create Spec: StarlingX - Distributed Cloud - Synchronized Keystone
As agreed upon within Edge-Computing meetings, this specification proposes an additional Identity solution for the Edge Reference Architecture; i.e. a 'Synchronized Keystone' solution. This solution addresses Edge-Computing Use Cases where full autonomy is required on network connectivity loss but without the overhead of running an Identity Provider (IDP) presence at each Edge Cloud site.

Change-Id: Ie60c324e01c23b262336ce24c481e359c5bd61d7
Signed-off-by: Greg Waines <greg.waines@windriver.com>
Notes
Notes (review):
 Code-Review+1: Dariush Eslimi <Dariush.Eslimi@windriver.com>
 Code-Review+1: Bart Wensley <barton.wensley@windriver.com>
 Code-Review+2: Brent Rowsell <brent.rowsell@windriver.com>
 Code-Review+2: Ian Jolliffe <ian.jolliffe@windriver.com>
 Code-Review+1: Tao Liu <tao.liu@windriver.com>
 Code-Review+1: ana <ana.cunha@ericsson.com>
 Code-Review+1: Shuquan Huang <huang.shuquan@99cloud.net>
 Code-Review+1: Ken Young <ken.young@windriver.com>
 Code-Review+1: Saul Wold <sgw@linux.intel.com>
 Code-Review+2: Curtis Collicutt <curtis@serverascode.com>
 Workflow+1: Ian Jolliffe <ian.jolliffe@windriver.com>
 Verified+2: Zuul
 Submitted-by: Zuul
 Submitted-at: Mon, 10 Dec 2018 14:00:17 +0000
 Reviewed-on: https://review.openstack.org/619053
 Project: openstack/stx-specs
 Branch: refs/heads/master
-rw-r--r-- specs/2019.03/approved/distcloud-2002842-synchronizedKeystone.rst 545
1 files changed, 545 insertions, 0 deletions
diff --git a/specs/2019.03/approved/distcloud-2002842-synchronizedKeystone.rst b/specs/2019.03/approved/distcloud-2002842-synchronizedKeystone.rst
new file mode 100644
index 0000000..76cf352
--- /dev/null
+++ b/specs/2019.03/approved/distcloud-2002842-synchronizedKeystone.rst
@@ -0,0 +1,545 @@
1..
2 This work is licensed under a Creative Commons Attribution 3.0 Unported
3 License. http://creativecommons.org/licenses/by/3.0/legalcode
4
5..
6 Many thanks to the OpenStack Nova team for the Example Spec that formed the
7 basis for this document.
8
9=========================================
10Distributed Cloud - Synchronized Keystone
11=========================================
12
13| Storyboard: https://storyboard.openstack.org/#!/story/2002842
14| ( Distributed Cloud Keystone Scalability )
15|
16
17The OpenStack Edge-Computing group has defined an Edge Reference Architecture.
18For Identity Management, it uses Federated Keystone to manage Identity across
19all Edge Clouds. If 'full autonomy' is required at Edge Clouds, this requires
20a Distributed Identity Provider Solution with an Identity Provider (IDP)
21presence at every Edge Cloud.
22
23The Federated Keystone solution makes sense where:
24
25* Integration with an existing IDP infrastructure is already required,
26* Large deployments would benefit from distributed IDP solutions,
27* Partial autonomy is acceptable in the presence of edge cloud isolation,
28 or
29* The cost of hosting an IDP presence at every Edge Cloud is acceptable for
30 full autonomy.
31
32The OpenStack Edge-Computing group recognizes that there is no
33'one-size-fits-all' architecture for the Edge. As agreed upon within
34the OpenStack Edge-Computing meetings, this specification proposes an
35additional Identity solution for the Edge Reference Architecture; i.e. a
36'Synchronized Keystone' solution. In the Synchronized Keystone solution,
37a Synchronization Framework synchronizes the Identity Resources of a Central
38Cloud to all of the Edge Clouds.
39
40Synchronized Keystone provides an Identity solution for the edge where:
41
42* a simpler standalone Identity solution can be used for the edge cloud
43 deployments, and
44* the edge cloud sites are compute-power-limited deployments, e.g. small
45 All-In-One (AIO) simplex / duplex servers, where the cost of hosting
46 an IDP presence in support of full autonomy is too high.
47
48Problem description
49===================
50
51In a distributed edge cloud environment, with 100s or 1000s of edge cloud
52sites, the centralized orchestration of cloud services across all the edge
53cloud sites is imperative for operational usability. This specification
54deals specifically with the centralized orchestration of the Identity Cloud
55Service across all the edge cloud sites.
56
57For the Identity Cloud Service, in a distributed edge cloud environment, it is
58desired to support the same set of Users and Projects across all edge clouds.
59I.e. at any edge cloud, a user can log in with the same User name and Project
60name, using the same authentication credentials and getting the same
61authorization capabilities and roles.
62
63Note that for some use cases, network connectivity between the edge cloud and
64the central cloud is not reliable. The Identity Cloud Service at the edge
65cloud must be fully autonomous in the event of network connectivity loss to
66the central cloud. I.e. both Service Users as well as Tenant Users must
67continue to be able to authenticate and be authorized when the edge cloud is
68isolated from the central cloud.
69
70This specification also enables an optimization for orchestration scalability
71in the distributed edge cloud environment. The orchestration of services
72across all edge clouds requires authentication, typically of the same user,
73across 100s/1000s of edge clouds. With the Identity Service's Users and
74Projects now synchronized across all edge clouds, then by additionally
75synchronizing Fernet Keys across all edge clouds, an authenticated Fernet
76Token generated at the Central Cloud can be used at any or all edge clouds;
77reducing the 100s or 1000s of authentication operations to a single
78authentication.
79
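The single-authentication optimization can be illustrated with a minimal sketch. It uses an HMAC-signed token from the Python standard library as a stand-in for a real Fernet token (an assumption made only to keep the example dependency-free). The names ``issue_token``, ``validate_token`` and the ``edge-N`` cloud names are hypothetical and do not come from StarlingX or Keystone.

```python
import hashlib
import hmac


def issue_token(key: bytes, payload: bytes) -> bytes:
    """Sign a payload with the shared key (stand-in for Fernet token creation)."""
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + mac


def validate_token(key: bytes, token: bytes) -> bool:
    """Validate a token locally, without contacting the issuing cloud."""
    payload, sep, mac = token.rpartition(b".")
    if not sep:
        return False
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return hmac.compare_digest(mac, expected)


# Assumption: every edge cloud holds the key synchronized from the Central Cloud.
shared_key = b"key-synchronized-from-central-cloud"
edge_keys = {f"edge-{n}": shared_key for n in range(1000)}

# One authentication at the Central Cloud ...
token = issue_token(shared_key, b"user=admin;project=admin")

# ... is accepted by every edge cloud that holds the same key.
accepted = [name for name, key in edge_keys.items() if validate_token(key, token)]
```

With synchronized keys, the single token is accepted at all 1000 hypothetical edge clouds; an edge cloud holding a different key would reject it.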
80Use Cases
81=========
82
83The requirement for common Identity Users and Projects across all edge clouds
84applies to all Edge Computing Use Cases.
85
86The Use Cases that require full autonomy of edge clouds (in the event of edge
87cloud isolation) are Use Cases where:
88
89* There are both
90
91 * Remote Physical users (at a central cloud site) and
92 * Local physical users (at edge cloud sites).
93
94* All 'userids' are centrally managed for security reasons,
95* At the edge cloud site,
96
97 * When connectivity to central cloud is lost
98
99 * local edge users must be able to manage their edge cloud and workloads on
100 the edge cloud,
101 * ... using their normal userid credentials.
102
103Examples of such Use Cases are:
104
105* Management of Retail Chains (e.g. Walmart)
106* Large Hospital Campus
107* Large Control Plant
108
109These are also Use Cases where the simplicity of a standalone Identity solution
110for the edge would be desirable.
111
112Background
113==========
114
115The Distributed Cloud (DC) sub-project within StarlingX already supports a
116Synchronization Framework which is used to synchronize Nova, Neutron, Cinder
117and StarlingX resources from the Central Cloud to all of the Edge Clouds.
118
119This Synchronization Framework provides:
120
121* Synchronization Request Management
122
123 * Managing Synchronization Request Message Queues per Edge Cloud,
124 * With retry on failure.
125
126* The Overall Synchronization Audit Sequencing,
127* Connectivity Status tracking for Edge Clouds, and
128* Synchronization Status tracking for Edge Clouds.
129
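The per-Edge-Cloud request queue with retry on failure can be sketched roughly as follows. This is an illustrative model only, not the stx-distcloud implementation: the class name ``SubcloudSyncQueue``, the ``max_retries`` parameter and the use of ``ConnectionError`` to signal a failed send are all assumptions.

```python
from collections import deque


class SubcloudSyncQueue:
    """Hypothetical ordered queue of sync requests for one edge cloud."""

    def __init__(self, max_retries=3):
        self.pending = deque()   # entries are (request, attempts so far)
        self.failed = []         # requests that exhausted their retries
        self.max_retries = max_retries

    def enqueue(self, request):
        self.pending.append((request, 0))

    def drain(self, send):
        """Send pending requests in order; return True when fully in sync.

        On a failed send the pass stops, so ordering is preserved and the
        same request is retried on the next pass.
        """
        while self.pending:
            request, attempts = self.pending[0]
            try:
                send(request)
            except ConnectionError:
                attempts += 1
                if attempts >= self.max_retries:
                    self.pending.popleft()
                    self.failed.append(request)
                else:
                    self.pending[0] = (request, attempts)
                return False
            self.pending.popleft()
        return not self.failed
```

In this sketch the return value of ``drain`` plays the role of the per-Edge-Cloud synchronization status; requests left in ``failed`` would be picked up by the audit.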
130For the existing framework, each Service being synchronized implements the
131following within the Synchronization Framework:
132
133* an API Proxy
134
135 * For intercepting Service API calls in order to trigger immediate
136 synchronization to Edge Clouds,
137
138* a DC Orchestration Module
139
140 * For Service-specific details of Service API Request building and auditing,
141 * For managing the mapping of resources in each subcloud to the canonical
142 resource in the central cloud, and
143 * (in future) for dealing with any API / Schema differences between Central
144 Cloud and Edge Cloud (e.g. in Software Upgrade scenario).
145
146Currently the existing Synchronization Framework supports REST API -based
147synchronization of a Service's resources.
148
149For OpenStack Keystone, a REST API -based synchronization approach will not
150work since not all details of Keystone resources are exposed through Keystone's
151REST APIs, e.g.:
152
153* User-IDs and Project-IDs can NOT be set on POST
154 (required to be synchronized so that Fernet Tokens can be used on any/all
155 edge clouds)
156* Revocation events, generated internally by Keystone to track events that
157 affect token validity, are NOT exposed via the Keystone REST API.
158
159Proposed change
160===============
161
162Synchronization Framework Support for Keystone DB-based Synchronization
163-----------------------------------------------------------------------
164
165This specification proposes enhancing the StarlingX Distributed Cloud
166Synchronization Framework to support DB-based synchronization of a Service's
167resources.
168
169I.e. use the existing Synchronization Framework in order to leverage the
170existing retry mechanisms, audit mechanisms, synch status tracking, etc.,
171but in this case, the Service Module within the 'DC Orchestration Engine'
172would synchronize DB Records by:
173
174* Directly querying/setting the Services' DB, and
175* Using a new (admin-only) StarlingX DC DB SYNC Service and its REST API
176 on the StarlingX Edge Cloud which exposes the DB operations remotely
177 for synchronization purposes.
178
179The Service's API Proxy triggers an immediate DB sync of the affected row(s)
180of the Service's DB table(s) in response to a particular API request, while the
181Synchronization Framework's Audit Mechanism (default every 10 mins) deals
182with non-API events, unexpected events and/or errors to ensure required DB
183Table(s) are in-sync.
184
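The audit step can be pictured as a comparison of the central and edge copies of a table, producing the DB SYNC operations needed to reconcile them. The following is a minimal sketch under stated assumptions: rows are modelled as dictionaries keyed by ID, and ``plan_audit_actions`` is a hypothetical helper, not a function from stx-distcloud.

```python
def plan_audit_actions(central_rows, edge_rows):
    """Return the (method, row) operations that bring an edge table in sync.

    Both arguments map a row ID to a dict of column values. Only creates
    (POST) and updates (PUT) are planned here.
    """
    actions = []
    for row_id, row in sorted(central_rows.items()):
        if row_id not in edge_rows:
            actions.append(("POST", row))   # missing on the edge cloud
        elif edge_rows[row_id] != row:
            actions.append(("PUT", row))    # present but out of date
    return actions


central = {
    "u1": {"id": "u1", "name": "admin", "enabled": True},
    "u2": {"id": "u2", "name": "alice", "enabled": False},
    "u3": {"id": "u3", "name": "bob", "enabled": True},
}
edge = {
    "u1": {"id": "u1", "name": "admin", "enabled": True},
    "u2": {"id": "u2", "name": "alice", "enabled": True},
}

actions = plan_audit_actions(central, edge)
```

In this example the audit plans one update (``u2``) and one create (``u3``). Deletions are deliberately left out of the sketch.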
185The following Keystone resources will be synchronized with this method:
186Users, Passwords, Projects, Roles, Role Assignments and Token Revocation
187Events.
188
189Synchronization of Fernet Keys
190------------------------------
191
192This specification also proposes enhancing the StarlingX Distributed
193Cloud Synchronization Framework to support API-based synchronization of
194the Fernet Key Repo.
195
196New REST APIs for bulk synching of the Fernet Key Repo, updating the Fernet
197Key Repo (on rotation of keys) and auditing of the Fernet Key Repo are
198added to the STX-CONFIG service.
199
200The Synchronization Framework will be extended to support Fernet Key Repo
201synchronization thru the STX-CONFIG service; adding a Fernet Key Manager to
202the STX-CONFIG DC Orchestration Module for managing the Fernet Key Repo
203synchronization messaging done by the Synchronization Framework.
204
205Alternatives
206============
207
208An alternative solution considered for synchronizing keystone would be to use
209built-in DB synchronization of open-source DBs used within StarlingX for
210the OpenStack Service DBs. I.e. use the built-in DB Synchronization
211capabilities of MariaDB or PostgreSQL, both of which support replication
212of DB Tables from a single R/W Master to multiple ReadOnly Slaves.
213
214However, the built-in DB synchronization solutions of MariaDB or PostgreSQL
215do NOT support the ability of handling different DB Schemas in the Central
216Cloud and Edge Clouds; i.e. required for Software Upgrade scenarios, or even
217just a heterogeneous mix of openstack-versioned edge clouds.
218
219Data model impact
220=================
221
222There are no DB Model changes required to any Services.
223
224REST API impact
225===============
226
227Synchronization Framework Support for Keystone DB-based Synchronization
228-----------------------------------------------------------------------
229
230The following REST APIs were added to the STX-DISTCLOUD service to support
231DB-based synchronization of Services between the Central Cloud and the
232Edge Clouds:
233
234NOTE: These are public REST APIs in the sense that the Central Cloud
235will use these REST APIs to synchronize data to the Edge Clouds. HOWEVER
236these REST APIs are NOT intended to be used by an end user.
237
238* GET /v1.0/identity/users
239
240 * Description: DB SYNC List all identity users
241 * Normal Response Codes: 200
242 * Error Response Codes: computeFault (400, 500, …),
243 serviceUnavailable (503), badRequest (400), unauthorized (401),
244 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
245 * Response Parameters:
246
247 * < all users of the Keystone DB Table >
248
249 * < all the attributes of the Keystone User DB Table >
250
251* GET /v1.0/identity/users/<UUID>
252
253 * Description: DB SYNC Get specific identity user
254 * Normal Response Codes: 200
255 * Error Response Codes: computeFault (400, 500, …),
256 serviceUnavailable (503), badRequest (400), unauthorized (401),
257 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
258 * Response Parameters:
259
260 * < all the attributes of the Keystone User DB Table >
261
262* POST /v1.0/identity/users
263
264 * Description: DB SYNC create identity user (and password)
265 * Normal Response Codes: 201
266 * Error Response Codes: computeFault (400, 500, …),
267 serviceUnavailable (503), badRequest (400), unauthorized (401),
268 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
269 * Request Parameters:
270
271 * < all the attributes of the Keystone User DB Table >
272
273* PUT /v1.0/identity/users/<UUID>
274
275 * Description: DB SYNC update identity user (and password)
276 * Normal Response Codes: 202
277 * Error Response Codes: computeFault (400, 500, …),
278 serviceUnavailable (503), badRequest (400), unauthorized (401),
279 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
280 * Request Parameters:
281
282 * < all the attributes of the Keystone User DB Table >
283
284
285... and similarly for the other Keystone DB Resources
286
287* GET /v1.0/identity/projects
288* GET /v1.0/identity/projects/<UUID>
289* POST /v1.0/identity/projects
290* PUT /v1.0/identity/projects/<UUID>
291
292|
293
294* GET /v1.0/identity/assignments
295* GET /v1.0/identity/assignments/<UUID>
296* POST /v1.0/identity/assignments
297* PUT /v1.0/identity/assignments/<UUID>
298
299|
300
301* GET /v1.0/identity/token-revocation-events
302* GET /v1.0/identity/token-revocation-events/<UUID>
303* POST /v1.0/identity/token-revocation-events
304
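A thin client wrapper around these endpoints might look like the sketch below. It only builds the HTTP requests (nothing is sent), using the Python standard library. The class name ``DbSyncClient``, the endpoint URL and the use of an ``X-Auth-Token`` header to carry the Keystone token are assumptions for illustration; the specification does not define them.

```python
import json
import urllib.request


class DbSyncClient:
    """Hypothetical wrapper that builds DB SYNC requests for one edge cloud."""

    RESOURCES = ("users", "projects", "assignments", "token-revocation-events")

    def __init__(self, endpoint, token):
        self.endpoint = endpoint.rstrip("/")
        self.token = token

    def build_request(self, method, resource, resource_id=None, body=None):
        if resource not in self.RESOURCES:
            raise ValueError(f"unknown identity resource: {resource}")
        url = f"{self.endpoint}/v1.0/identity/{resource}"
        if resource_id is not None:
            url += f"/{resource_id}"
        data = json.dumps(body).encode() if body is not None else None
        headers = {
            "X-Auth-Token": self.token,          # assumed auth header
            "Content-Type": "application/json",
        }
        return urllib.request.Request(url, data=data, headers=headers,
                                      method=method)


# Hypothetical endpoint and token values.
client = DbSyncClient("https://edge-1.example.com/dbsync/", "token-from-central")
list_req = client.build_request("GET", "users")
update_req = client.build_request("PUT", "users", "abc123",
                                  body={"id": "abc123", "name": "alice"})
```

A caller would pass the resulting request to ``urllib.request.urlopen`` (or an equivalent HTTP client) and treat the normal response codes listed above as success.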
305Synchronization of Fernet Keys
306------------------------------
307
308The following REST APIs were added to the STX-CONFIG service to support
309synchronization of Fernet Key Repo between the Central Cloud and the
310Edge Clouds:
311
312NOTE: These are public REST APIs in the sense that the Central Cloud
313will use these REST APIs to synchronize data to the Edge Clouds. HOWEVER
314these REST APIs are NOT intended to be used by an end user.
315
316* POST /v1/fernet_repo
317
318 * Description: Distribute fernet repo
319 * Normal Response Codes: 201
320 * Error Response Codes: computeFault (400, 500, …),
321 serviceUnavailable (503), badRequest (400), unauthorized (401),
322 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
323 * Request Parameters:
324
325 * Content-Type application/json
326
327 * Style: Plain
328 * Type: Xsd:String
329 * Description: The list of Fernet Keys.
330
331* PUT /v1/fernet_repo
332
333 * Description: Update fernet repo with keys
334 * Normal Response Codes: 202
335 * Error Response Codes: computeFault (400, 500, …),
336 serviceUnavailable (503), badRequest (400), unauthorized (401),
337 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
338 * Request Parameters:
339
340 * Content-Type application/json
341
342 * Style: Plain
343 * Type: Xsd:String
344 * Description: The list of Fernet Keys.
345
346* GET /v1/fernet_repo
347
348 * Description: List contents of fernet_repo (the keys)
349 * Normal Response Codes: 200
350 * Error Response Codes: computeFault (400, 500, …),
351 serviceUnavailable (503), badRequest (400), unauthorized (401),
352 forbidden (403), badMethod (405), overLimit (413), badMediaType (415)
353 * Response Parameters:
354
355 * Fernet_keys
356
357 * Style: Plain
358 * Type: Xsd:List
359 * Description: The list of fernet keys
360
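The key rotation and audit that drive these calls can be sketched as follows. The sketch models a Fernet key repository as a mapping from key index to key material, following the Keystone convention that index 0 is the staged key and the highest index is the primary key. The function names, the ``max_active_keys`` default and the in-memory representation are assumptions; the actual payload format of ``/v1/fernet_repo`` is not defined beyond "the list of Fernet Keys".

```python
import base64
import os


def new_key() -> str:
    """Generate key material: 32 random bytes, URL-safe base64 encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


def rotate(repo: dict, max_active_keys: int = 3) -> dict:
    """Return a rotated copy of a key repo (index -> key material).

    The staged key (index 0) is promoted to primary, a new staged key is
    created, and the oldest secondary keys are purged.
    """
    rotated = dict(repo)
    next_index = max(rotated) + 1
    rotated[next_index] = rotated.pop(0)   # promote staged key to primary
    rotated[0] = new_key()                 # create the new staged key
    while len(rotated) > max_active_keys:
        oldest = min(index for index in rotated if index != 0)
        del rotated[oldest]
    return rotated


def out_of_sync(central_repo: dict, edge_repo: dict) -> bool:
    """Audit check: an edge cloud needs redistribution if its repo differs."""
    return central_repo != edge_repo


central_repo = {0: "staged-key", 1: "old-key", 2: "primary-key"}
rotated_repo = rotate(central_repo, max_active_keys=3)
```

After rotation, an edge cloud still holding the previous repo fails the audit check and would receive the new keys through the STX-CONFIG API.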
361Security impact
362===============
363
364This work only impacts security in a Distributed Cloud environment.
365
366In a Distributed Cloud environment, this work directly manipulates Identity
367data by synchronizing selected Keystone resources and Fernet Keys between
368the Central Cloud and the Edge Clouds.
369
370The only external impact is that in a Distributed Cloud environment,
371a Token created on any Cloud (Central or Edge) can be used on any or
372all Clouds (Central or Edge).
373
374Other end user impact
375=====================
376
377This work only impacts end users in a Distributed Cloud environment.
378
379In a Distributed Cloud environment, a user can indirectly interact with the
380feature when using ANY OpenStack Service API across Edge Clouds by
381leveraging the fact that a Token created on the Central Cloud can be
382used on any or all Edge Clouds.
383
384In a Distributed Cloud environment, in an edge cloud network isolation
385scenario, an end user, local to the edge site, can now log in / authenticate
386with their normal userid and credentials and manage their workloads.
387
388Performance Impact
389==================
390
391This work only impacts performance in a Distributed Cloud environment.
392
393Overall there is a reduced amount of synchronization messaging between
394the Central Cloud and the Edge Clouds in a Distributed Cloud Environment.
395
396Logically more data is being synchronized; i.e. Fernet Keys and selected
397Keystone DB Resources, in addition to the existing selected STX, Nova,
398Neutron and Cinder DB Resources. However with the ability to use a
399single Token, generated on the Central Cloud, for ALL Edge Cloud
400synchronization messages, this drastically reduces the Synchronization
401Framework messaging.
402
403Other deployer impact
404=====================
405
406There are no deployer impacts with this work.
407
408Developer impact
409=================
410
411In a Distributed Cloud environment, developers implementing new services
412that orchestrate across all Edge Clouds should leverage the fact that
413a Token created on the Central Cloud can be used on ANY / ALL Edge Clouds,
414in order to reduce their messaging impact on the system.
415
416
417Upgrade impact
418===============
419
420In a Distributed Cloud environment, there are upgrade impacts with this work;
421i.e. when upgrading from OpenStack Version N to OpenStack Version N+1.
422
423This work is sensitive to any Keystone DB Model changes. However the
424architecture of the DB-based synchronization within the StarlingX
425Distributed Cloud Synchronization Framework does support the ability
426to manage DB Schema changes between the Central Cloud and the Edge Cloud.
427This was one of the major reasons for choosing this approach.
428
429The plan for Software Upgrades (from one OpenStack Version to another), in
430a Distributed Cloud environment, is that the Central Cloud will be
431upgraded first to version N+1, and then the Edge Clouds.
432
433If the Keystone DB Schema changes between version N and version N+1,
434the N+1 version of Distributed Cloud Synchronization Framework must
435implement the Keystone DB Schema conversions between N+1 and N,
436for all synchronization messages during the Rolling Software Upgrade
437across the entire Distributed Cloud system.
438
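A schema conversion of this kind can be sketched as a projection of a version N+1 row onto the version N column set. This is a minimal illustration only: ``convert_row`` and the column names are hypothetical, and real conversions may also need renames or value transformations.

```python
def convert_row(row, target_columns, defaults=None):
    """Project a row onto the columns known to the target schema version.

    Columns unknown to the target schema are dropped; columns the source row
    lacks are filled from ``defaults`` or rejected if no default is given.
    """
    defaults = defaults or {}
    converted = {}
    for column in target_columns:
        if column in row:
            converted[column] = row[column]
        elif column in defaults:
            converted[column] = defaults[column]
        else:
            raise KeyError(f"no value or default for column {column!r}")
    return converted


# Hypothetical example: version N+1 adds a column that version N lacks.
row_n_plus_1 = {"id": "u1", "name": "alice", "enabled": True,
                "new_in_n_plus_1": "x"}
columns_n = ["id", "name", "enabled"]

row_n = convert_row(row_n_plus_1, columns_n)
```

Here the Central Cloud (already at N+1) would send ``row_n`` to any Edge Cloud still running version N.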
439Implementation
440==============
441
442Assignee(s)
443===========
444
445Primary assignee:
446 Andy Ning
447
448Other contributors:
449 Tao Liu
450
451Repos Impacted
452==============
453
454Repositories in StarlingX that are impacted by this spec:
455
456* stx-distcloud
457
458Work Items
459===========
460
461Synchronization Framework Support for Keystone DB-based Synchronization
462-----------------------------------------------------------------------
463
464* Introduce dbsync agent/api on sub cloud, and add it to starlingx as a new
465 service,
466* REST APIs between dcorch engine and dbsync agent (POST/PUT/GET),
467* Implement dbsync client to wrap dbsync APIs into python functions,
468* Enhance identity module within dcorch engine to do DB based resource
469 synchronization,
470* Enhance identity module within dcorch engine to do DB based resource audit,
471* Add new resources to be synced (token revocation events),
472
473 * NOTE: the current code already synchronizes users, passwords, projects,
474 roles and role assignments, albeit using API-based synchronization,
475
476* Deployment and configuration of new StarlingX DistCloud Services,
477* Unit test.
478
479
480Synchronization of Fernet Keys
481------------------------------
482
483* Add new stx-config APIs (POST) for central cloud to distribute fernet repo
484 including RPC between stx-config API and conductor,
485* Add new stx-config APIs (GET) for central cloud to audit existing keys
486 including RPC between stx-config API and conductor,
487* Add new stx-config APIs (PUT) for central cloud to update repo with keys
488 including RPC between stx-config API and conductor,
489* Within stx-config, safely retrieve and update fernet keys,
490* Enhance stx-distcloud orch engine (or cron job) to rotate keys and
491 call stx-config APIs to distribute new keys,
492* Enhance stx-distcloud orch engine to audit fernet keys across managed
493 sub clouds, and call stx-config APIs to distribute keys if mis-matches found,
494* Enhance dc manager to trigger key distribution when a sub cloud becomes
495 managed,
496* Add logic to stx-config to empty and re-setup fernet repo locally when
497 an empty POST is received,
498* stx-config/stx-metal/stx-distcloud unit test (Tox),
499* Manifest for fernet repo and keys creation during deployment may not need
500 any changes on either the central cloud or the sub clouds.
501
502Dependencies
503============
504
505There are no external dependencies for this work.
506
507I.e. there are NO requirements on changes to OpenStack Keystone.
508
509Testing
510=======
511
512Need to do explicit testing of Fernet Token synchronization and Keystone
513DB Resource synchronization between Central Cloud and Edge Clouds.
514
515Need to do COMPLETE regression of StarlingX Distributed Cloud (DC)
516functionality.
517
518Should qualitatively evaluate performance / messaging scalability
519improvements before and after this work.
520
521Need to do a SANITY regression of StarlingX in a NON-DC environment.
522
523Documentation Impact
524====================
525
526Currently there is no documentation on the StarlingX Distributed Cloud
527functionality. When this documentation is created, the work of this
528specification should be described at a functional level.
529
530References
531==========
532
533None.
534
535
536History
537=======
538
539.. list-table:: Revisions
540 :header-rows: 1
541
542 * - Release Name
543 - Description
544 * - 19.03
545 - Introduced