=====================================================================
Everything you never wanted to know about kobjects, ksets, and ktypes
=====================================================================

:Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
:Last updated: December 19, 2007

Based on an original article by Jon Corbet for lwn.net written October 1,
2003 and located at https://lwn.net/Articles/51437/

Part of the difficulty in understanding the driver model - and the kobject
abstraction upon which it is built - is that there is no obvious starting
place. Dealing with kobjects requires understanding a few different types,
all of which make reference to each other. In an attempt to make things
easier, we'll take a multi-pass approach, starting with vague terms and
adding detail as we go. To that end, here are some quick definitions of
some terms we will be working with.

 - A kobject is an object of type struct kobject. Kobjects have a name
   and a reference count. A kobject also has a parent pointer (allowing
   objects to be arranged into hierarchies), a specific type, and,
   usually, a representation in the sysfs virtual filesystem.

   Kobjects are generally not interesting on their own; instead, they are
   usually embedded within some other structure which contains the stuff
   the code is really interested in.

   No structure should **EVER** have more than one kobject embedded within it.
   If it does, the reference counting for the object is sure to be messed
   up and incorrect, and your code will be buggy. So do not do this.

 - A ktype is the type of object that embeds a kobject. Every structure
   that embeds a kobject needs a corresponding ktype. The ktype controls
   what happens to the kobject when it is created and destroyed.

 - A kset is a group of kobjects. These kobjects can be of the same ktype
   or belong to different ktypes. The kset is the basic container type for
   collections of kobjects. Ksets contain their own kobjects, but you can
   safely ignore that implementation detail as the kset core code handles
   this kobject automatically.

   When you see a sysfs directory full of other directories, generally each
   of those directories corresponds to a kobject in the same kset.

We'll look at how to create and manipulate all of these types. A bottom-up
approach will be taken, so we'll go back to kobjects.


Embedding kobjects
==================

It is rare for kernel code to create a standalone kobject, with one major
exception explained below. Instead, kobjects are used to control access to
a larger, domain-specific object. To this end, kobjects will be found
embedded in other structures. If you are used to thinking of things in
object-oriented terms, kobjects can be seen as a top-level, abstract class
from which other classes are derived. A kobject implements a set of
capabilities which are not particularly useful by themselves, but are
nice to have in other objects. The C language does not allow for the
direct expression of inheritance, so other techniques - such as structure
embedding - must be used.

(As an aside, for those familiar with the kernel linked list implementation,
this is analogous to how "list_head" structs are rarely useful on
their own, but are invariably found embedded in the larger objects of
interest.)

So, for example, the UIO code in ``drivers/uio/uio.c`` has a structure that
defines the memory region associated with a uio device::

    struct uio_map {
            struct kobject kobj;
            struct uio_mem *mem;
    };

If you have a struct uio_map structure, finding its embedded kobject is
just a matter of using the kobj member. Code that works with kobjects will
often have the opposite problem, however: given a struct kobject pointer,
what is the pointer to the containing structure? You must avoid tricks
(such as assuming that the kobject is at the beginning of the structure)
and, instead, use the container_of() macro, found in ``<linux/kernel.h>``::

    container_of(ptr, type, member)

where:

  * ``ptr`` is the pointer to the embedded kobject,
  * ``type`` is the type of the containing structure, and
  * ``member`` is the name of the structure field to which ``ptr`` points.

The return value from container_of() is a pointer to the corresponding
container type. So, for example, a pointer ``kp`` to a struct kobject
embedded **within** a struct uio_map could be converted to a pointer to the
**containing** uio_map structure with::

    struct uio_map *u_map = container_of(kp, struct uio_map, kobj);

For convenience, programmers often define a simple macro for **back-casting**
kobject pointers to the containing type. Exactly this happens in the
earlier ``drivers/uio/uio.c``, as you can see here::

    struct uio_map {
            struct kobject kobj;
            struct uio_mem *mem;
    };

    #define to_map(map) container_of(map, struct uio_map, kobj)

where the macro argument "map" is a pointer to the struct kobject in
question. That macro is subsequently invoked with::

    struct uio_map *map = to_map(kobj);

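
To see why container_of() is safer than assuming a layout, the back-casting
described above can be tried outside the kernel. The following is only a
userspace sketch under stated assumptions: ``struct kobject`` is a stand-in
with a single field, container_of() is a simplified version built on
offsetof() (the kernel macro adds type checking), the kobject is deliberately
moved away from the start of the structure, and the helper map_from_kobj()
is hypothetical::

    #include <stddef.h>

    /* Stand-in for the kernel's struct kobject; only its address matters. */
    struct kobject {
            const char *name;
    };

    struct uio_mem;                 /* opaque in this sketch */

    /* Assumption for illustration: the kobject is NOT the first member. */
    struct uio_map {
            struct uio_mem *mem;
            struct kobject kobj;
    };

    /* Simplified container_of(); subtracts the member offset from ptr. */
    #define container_of(ptr, type, member) \
            ((type *)((char *)(ptr) - offsetof(type, member)))

    #define to_map(map) container_of(map, struct uio_map, kobj)

    /* Recover the containing uio_map from its embedded kobject. */
    static struct uio_map *map_from_kobj(struct kobject *kobj)
    {
            return to_map(kobj);
    }

Because the offset of ``kobj`` is computed by the compiler, the conversion
keeps working if members are later added or reordered, which a plain cast
of the kobject pointer would not.
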


Initialization of kobjects
==========================

Code which creates a kobject must, of course, initialize that object. Some
of the internal fields are set up with a (mandatory) call to kobject_init()::

    void kobject_init(struct kobject *kobj, const struct kobj_type *ktype);

The ktype is required for a kobject to be created properly, as every kobject
must have an associated kobj_type. After calling kobject_init(), to
register the kobject with sysfs, the function kobject_add() must be called::

    int kobject_add(struct kobject *kobj, struct kobject *parent,
                    const char *fmt, ...);

This sets up the parent of the kobject and the name for the kobject
properly. If the kobject is to be associated with a specific kset,
kobj->kset must be assigned before calling kobject_add(). If a kset is
associated with a kobject, then the parent for the kobject can be set to
NULL in the call to kobject_add() and then the kobject's parent will be the
kset itself.

As the name of the kobject is set when it is added to the kernel, the name
of the kobject should never be manipulated directly. If you must change
the name of the kobject, call kobject_rename()::

    int kobject_rename(struct kobject *kobj, const char *new_name);

kobject_rename() does not perform any locking or have a solid notion of
what names are valid, so the caller must provide their own sanity checking
and serialization.

There is a function called kobject_set_name() but that is legacy cruft and
is being removed. If your code needs to call this function, it is
incorrect and needs to be fixed.

To properly access the name of the kobject, use the function
kobject_name()::

    const char *kobject_name(const struct kobject *kobj);

There is a helper function to both initialize and add the kobject to the
kernel at the same time, called, surprisingly enough, kobject_init_and_add()::

    int kobject_init_and_add(struct kobject *kobj, const struct kobj_type *ktype,
                             struct kobject *parent, const char *fmt, ...);

The arguments are the same as the individual kobject_init() and
kobject_add() functions described above.


Uevents
=======

After a kobject has been registered with the kobject core, you need to
announce to the world that it has been created. This can be done with a
call to kobject_uevent()::

    int kobject_uevent(struct kobject *kobj, enum kobject_action action);

Use the **KOBJ_ADD** action for when the kobject is first added to the kernel.
This should be done only after any attributes or children of the kobject
have been initialized properly, as userspace will instantly start to look
for them when this call happens.

When the kobject is removed from the kernel (details on how to do that are
below), the uevent for **KOBJ_REMOVE** will be automatically created by the
kobject core, so the caller does not have to worry about doing that by
hand.


Reference counts
================

One of the key functions of a kobject is to serve as a reference counter
for the object in which it is embedded. As long as references to the object
exist, the object (and the code which supports it) must continue to exist.
The low-level functions for manipulating a kobject's reference counts are::

    struct kobject *kobject_get(struct kobject *kobj);
    void kobject_put(struct kobject *kobj);

A successful call to kobject_get() will increment the kobject's reference
counter and return the pointer to the kobject.

When a reference is released, the call to kobject_put() will decrement the
reference count and, possibly, free the object. Note that kobject_init()
sets the reference count to one, so the code which sets up the kobject will
need to do a kobject_put() eventually to release that reference.

Because kobjects are dynamic, they must not be declared statically or on
the stack, but must instead always be allocated dynamically. Future versions
of the kernel will contain a run-time check for kobjects that are created
statically and will warn the developer of this improper usage.

If all that you want to use a kobject for is to provide a reference counter
for your structure, please use the struct kref instead; a kobject would be
overkill. For more information on how to use struct kref, please see the
file Documentation/core-api/kref.rst in the Linux kernel source tree.


Creating "simple" kobjects
==========================

Sometimes all that a developer wants is a way to create a simple directory
in the sysfs hierarchy, and not have to mess with the whole complication of
ksets, show and store functions, and other details. This is the one
exception where a single kobject should be created. To create such an
entry, use the function::

    struct kobject *kobject_create_and_add(const char *name, struct kobject *parent);

This function will create a kobject and place it in sysfs in the location
underneath the specified parent kobject. To create simple attributes
associated with this kobject, use::

    int sysfs_create_file(struct kobject *kobj, const struct attribute *attr);

or::

    int sysfs_create_group(struct kobject *kobj, const struct attribute_group *grp);

Both types of attributes used here, with a kobject that has been created
with kobject_create_and_add(), can be of type kobj_attribute, so no
special custom attribute needs to be created.

See the example module, ``samples/kobject/kobject-example.c``, for an
implementation of a simple kobject and attributes.



ktypes and release methods
==========================

One important thing still missing from the discussion is what happens to a
kobject when its reference count reaches zero. The code which created the
kobject generally does not know when that will happen; if it did, there
would be little point in using a kobject in the first place. Even
predictable object lifecycles become more complicated when sysfs is brought
in as other portions of the kernel can get a reference on any kobject that
is registered in the system.

The end result is that a structure protected by a kobject cannot be freed
before its reference count goes to zero. The reference count is not under
the direct control of the code which created the kobject. So that code must
be notified asynchronously whenever the last reference to one of its
kobjects goes away.

Once you have registered your kobject via kobject_add(), you must never use
kfree() to free it directly. The only safe way is to use kobject_put(). It
is good practice to always use kobject_put() after kobject_init() to avoid
errors creeping in.

This notification is done through a kobject's release() method. Usually
such a method has a form like::

    void my_object_release(struct kobject *kobj)
    {
            struct my_object *mine = container_of(kobj, struct my_object, kobj);

            /* Perform any additional cleanup on this object, then... */
            kfree(mine);
    }

One important point cannot be overstated: every kobject must have a
release() method, and the kobject must persist (in a consistent state)
until that method is called. If these constraints are not met, the code is
flawed. Note that the kernel will warn you if you forget to provide a
release() method. Do not try to get rid of this warning by providing an
"empty" release function.

If all your cleanup function needs to do is call kfree(), then you must
create a wrapper function which uses container_of() to upcast to the correct
type (as shown in the example above) and then calls kfree() on the overall
structure.

Note, the name of the kobject is available in the release function, but it
must NOT be changed within this callback. Otherwise there will be a memory
leak in the kobject core, which makes people unhappy.

Interestingly, the release() method is not stored in the kobject itself;
instead, it is associated with the ktype. So let us introduce struct
kobj_type::

    struct kobj_type {
            void (*release)(struct kobject *kobj);
            const struct sysfs_ops *sysfs_ops;
            const struct attribute_group **default_groups;
            const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj);
            const void *(*namespace)(struct kobject *kobj);
            void (*get_ownership)(struct kobject *kobj, kuid_t *uid, kgid_t *gid);
    };

This structure is used to describe a particular type of kobject (or, more
correctly, of containing object). Every kobject needs to have an associated
kobj_type structure; a pointer to that structure must be specified when you
call kobject_init() or kobject_init_and_add().

The release field in struct kobj_type is, of course, a pointer to the
release() method for this type of kobject. Two of the other fields (sysfs_ops
and default_groups) control how objects of this type are represented in
sysfs; they are beyond the scope of this document.

The default_groups pointer is a list of default attribute groups whose
attributes will be automatically created for any kobject that is registered
with this ktype.

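
The interplay between the reference count and the ktype's release() method
can be modelled in a few lines of ordinary C. What follows is a userspace
sketch, not kernel code: ``struct kobj_model``, ``struct ktype_model``,
model_init(), model_get(), model_put() and my_object_create() are all
hypothetical names, a plain ``int`` stands in for the kernel's atomic
reference count, and ``release_calls`` exists only so the behaviour can be
observed::

    #include <stddef.h>
    #include <stdlib.h>

    struct kobj_model;

    /* Model of struct kobj_type: only the release() hook is represented. */
    struct ktype_model {
            void (*release)(struct kobj_model *kobj);
    };

    /* Model of struct kobject: a reference count and a pointer to its type. */
    struct kobj_model {
            int refcount;           /* not atomic; illustration only */
            const struct ktype_model *ktype;
    };

    #define container_of(ptr, type, member) \
            ((type *)((char *)(ptr) - offsetof(type, member)))

    /* Like kobject_init(): the creator starts out holding one reference. */
    static void model_init(struct kobj_model *kobj,
                           const struct ktype_model *ktype)
    {
            kobj->refcount = 1;
            kobj->ktype = ktype;
    }

    static struct kobj_model *model_get(struct kobj_model *kobj)
    {
            kobj->refcount++;
            return kobj;
    }

    /* Like kobject_put(): the final put hands the object to release(). */
    static void model_put(struct kobj_model *kobj)
    {
            if (--kobj->refcount == 0)
                    kobj->ktype->release(kobj);
    }

    /* The containing object; the kobject is embedded, not pointed to. */
    struct my_object {
            int payload;
            struct kobj_model kobj;
    };

    static int release_calls;       /* instrumentation for this sketch */

    static void my_object_release(struct kobj_model *kobj)
    {
            struct my_object *mine = container_of(kobj, struct my_object, kobj);

            release_calls++;
            free(mine);
    }

    static const struct ktype_model my_ktype = {
            .release = my_object_release,
    };

    static struct my_object *my_object_create(int payload)
    {
            struct my_object *mine = malloc(sizeof(*mine));

            if (!mine)
                    return NULL;
            mine->payload = payload;
            model_init(&mine->kobj, &my_ktype);
            return mine;
    }

In this model the creator never calls free() itself: it drops its initial
reference with model_put(), and the memory goes away only when the last
holder has done the same, which is the contract the text above describes
for kobject_put() and release().
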


ksets
=====

A kset is merely a collection of kobjects that want to be associated with
each other. There is no restriction that they be of the same ktype, but be
very careful if they are not.

A kset serves these functions:

 - It serves as a bag containing a group of objects. A kset can be used by
   the kernel to track "all block devices" or "all PCI device drivers."

 - A kset is also a subdirectory in sysfs, where the kobjects associated
   with the kset can show up. Every kset contains a kobject which can be
   set up to be the parent of other kobjects; the top-level directories of
   the sysfs hierarchy are constructed in this way.

 - Ksets can support the "hotplugging" of kobjects and influence how
   uevent events are reported to user space.

In object-oriented terms, "kset" is the top-level container class; ksets
contain their own kobject, but that kobject is managed by the kset code and
should not be manipulated by any other user.

A kset keeps its children in a standard kernel linked list. Kobjects point
back to their containing kset via their kset field. In almost all cases,
the kobjects belonging to a kset have that kset (or, strictly, its embedded
kobject) in their parent.

As a kset contains a kobject within it, it should always be dynamically
created and never declared statically or on the stack. To create a new
kset use::

    struct kset *kset_create_and_add(const char *name,
                                     const struct kset_uevent_ops *uevent_ops,
                                     struct kobject *parent_kobj);

When you are finished with the kset, call::

    void kset_unregister(struct kset *k);

to destroy it. This removes the kset from sysfs and decrements its reference
count. When the reference count goes to zero, the kset will be released.
Because other references to the kset may still exist, the release may happen
after kset_unregister() returns.

An example of using a kset can be seen in the
``samples/kobject/kset-example.c`` file in the kernel tree.

If a kset wishes to control the uevent operations of the kobjects
associated with it, it can use the struct kset_uevent_ops to handle it::

    struct kset_uevent_ops {
            int (* const filter)(struct kobject *kobj);
            const char *(* const name)(struct kobject *kobj);
            int (* const uevent)(struct kobject *kobj, struct kobj_uevent_env *env);
    };


The filter function allows a kset to prevent a uevent from being emitted to
userspace for a specific kobject. If the function returns 0, the uevent
will not be emitted.

The name function will be called to override the default name of the kset
that the uevent sends to userspace. By default, the name will be the same
as the kset itself, but this function, if present, can override that name.

The uevent function will be called when the uevent is about to be sent to
userspace to allow more environment variables to be added to the uevent.

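
The filter and name rules just described can be written down as a small
decision procedure. The following userspace sketch only restates those two
rules; it is not the kobject core's implementation. All names in it
(``struct uevent_ops_model``, would_emit(), reported_name(), hide_hidden(),
fixed_name()) are hypothetical, and the policy of hiding kobjects whose
name starts with "hidden" is invented purely for illustration::

    #include <stddef.h>
    #include <string.h>

    struct kobj_model {
            const char *name;
    };

    /* Model of struct kset_uevent_ops with only filter and name. */
    struct uevent_ops_model {
            int (*filter)(struct kobj_model *kobj);
            const char *(*name)(struct kobj_model *kobj);
    };

    /* A filter that returns 0 suppresses the uevent; otherwise it is sent. */
    static int would_emit(const struct uevent_ops_model *ops,
                          struct kobj_model *kobj)
    {
            if (ops && ops->filter && ops->filter(kobj) == 0)
                    return 0;
            return 1;
    }

    /* The kset's own name is reported unless a name function overrides it. */
    static const char *reported_name(const struct uevent_ops_model *ops,
                                     const char *kset_name,
                                     struct kobj_model *kobj)
    {
            if (ops && ops->name)
                    return ops->name(kobj);
            return kset_name;
    }

    /* Invented policy: no uevents for kobjects named "hidden...". */
    static int hide_hidden(struct kobj_model *kobj)
    {
            return strncmp(kobj->name, "hidden", 6) != 0;
    }

    static const char *fixed_name(struct kobj_model *kobj)
    {
            (void)kobj;
            return "my_subsystem";
    }

Note the polarity of the filter: a return value of 0 means "do not emit",
so a filter that wants every event delivered must return nonzero.
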

One might ask how, exactly, a kobject is added to a kset, given that no
functions which perform that function have been presented. The answer is
that this task is handled by kobject_add(). When a kobject is passed to
kobject_add(), its kset member should point to the kset to which the
kobject will belong. kobject_add() will handle the rest.

If the kobject belonging to a kset has no parent kobject set, it will be
added to the kset's directory. Not all members of a kset necessarily
live in the kset directory. If an explicit parent kobject is assigned
before the kobject is added, the kobject is registered with the kset, but
added below the parent kobject.

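
The placement rule above separates two questions: which kset a kobject
belongs to, and which directory it appears under. A userspace sketch of
that rule, with hypothetical types ``struct kobj_model`` and
``struct kset_model`` and a hypothetical model_add() standing in for
kobject_add(), might look like this (a member counter replaces the kset's
linked list)::

    #include <stddef.h>

    struct kset_model;

    struct kobj_model {
            struct kobj_model *parent;
            struct kset_model *kset;
    };

    struct kset_model {
            struct kobj_model kobj; /* the kset's own embedded kobject */
            int members;            /* stands in for the kset's list */
    };

    /*
     * With a kset and no explicit parent, the kset's kobject becomes the
     * parent. With an explicit parent, the kobject still joins the kset
     * but is placed below that parent.
     */
    static void model_add(struct kobj_model *kobj, struct kobj_model *parent)
    {
            if (kobj->kset) {
                    if (!parent)
                            parent = &kobj->kset->kobj;
                    kobj->kset->members++;
            }
            kobj->parent = parent;
    }

Adding two kobjects that share a kset, one with a NULL parent and one with
an explicit parent, leaves both counted as members while only the first
ends up under the kset's own kobject.
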


Kobject removal
===============

After a kobject has been registered with the kobject core successfully, it
must be cleaned up when the code is finished with it. To do that, call
kobject_put(). By doing this, the kobject core will automatically clean up
all of the memory allocated by this kobject. If a ``KOBJ_ADD`` uevent has been
sent for the object, a corresponding ``KOBJ_REMOVE`` uevent will be sent, and
any other sysfs housekeeping will be handled for the caller properly.

If you need to do a two-stage delete of the kobject (say you are not
allowed to sleep when you need to destroy the object), then call
kobject_del() which will unregister the kobject from sysfs. This makes the
kobject "invisible", but it is not cleaned up, and the reference count of
the object is still the same. At a later time call kobject_put() to finish
the cleanup of the memory associated with the kobject.

kobject_del() can be used to drop the reference to the parent object if
circular references are constructed. It is valid in some cases for a
parent object to reference a child. Circular references **must** be broken
with an explicit call to kobject_del(), so that a release function will be
called and the objects in the former circle release each other.

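
Why a circular reference needs an explicit kobject_del() is easiest to see
by counting references by hand. The sketch below is a userspace model
under stated assumptions, not kernel code: ``struct node`` and the
node_*() helpers are hypothetical, a child's reference on its parent is
taken in node_add() and dropped in node_del(), and a ``released`` flag
replaces a real release() method::

    #include <stddef.h>

    struct node {
            int refcount;
            struct node *parent;    /* reference held while registered */
            struct node *child;     /* extra reference held by a parent */
            int released;
    };

    static void node_put(struct node *n);

    static void node_init(struct node *n)
    {
            n->refcount = 1;
            n->parent = NULL;
            n->child = NULL;
            n->released = 0;
    }

    static struct node *node_get(struct node *n)
    {
            n->refcount++;
            return n;
    }

    /* Like kobject_add(): the child pins its parent. */
    static void node_add(struct node *n, struct node *parent)
    {
            n->parent = node_get(parent);
    }

    /* Like kobject_del(): drop the parent reference, keep our own count. */
    static void node_del(struct node *n)
    {
            struct node *parent = n->parent;

            n->parent = NULL;
            if (parent)
                    node_put(parent);
    }

    static void node_put(struct node *n)
    {
            if (--n->refcount > 0)
                    return;
            /* Last reference gone: release and drop what this node holds. */
            n->released = 1;
            node_del(n);
            if (n->child)
                    node_put(n->child);
    }

If a parent also takes a reference on its child, both creators can drop
their initial references and neither node is released, because each still
pins the other. Calling node_del() on the child drops its reference on the
parent; the parent is then released, lets go of the child, and the child
is released in turn.
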


Example code to copy from
=========================

For a more complete example of using ksets and kobjects properly, see the
example programs ``samples/kobject/{kobject-example.c,kset-example.c}``,
which will be built as loadable modules if you select ``CONFIG_SAMPLE_KOBJECT``.